apply and lapply {pbdMPI} | R Documentation |
These functions are parallel versions of the apply, lapply, and sapply functions.
pbdApply(X, MARGIN, FUN, ..., pbd.mode = c("mw", "spmd", "dist"),
         rank.source = .pbd_env$SPMD.CT$rank.root,
         comm = .pbd_env$SPMD.CT$comm, barrier = TRUE)

pbdLapply(X, FUN, ..., pbd.mode = c("mw", "spmd", "dist"),
          rank.source = .pbd_env$SPMD.CT$rank.root,
          comm = .pbd_env$SPMD.CT$comm, bcast = FALSE, barrier = TRUE)

pbdSapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE,
          pbd.mode = c("mw", "spmd", "dist"),
          rank.source = .pbd_env$SPMD.CT$rank.root,
          comm = .pbd_env$SPMD.CT$comm, bcast = FALSE, barrier = TRUE)
X |
a matrix or array in |
MARGIN |
|
FUN |
as in the |
... |
optional arguments to |
simplify |
as in the |
USE.NAMES |
as in the |
pbd.mode |
mode of distributed data |
rank.source |
a rank of source where |
comm |
a communicator number. |
bcast |
whether to broadcast the results to all ranks.
barrier |
whether to call a barrier on all ranks.
All functions are mainly called in manager/workers mode (pbd.mode = "mw") and work the same as their serial versions.
If pbd.mode = "mw", the X in rank.source (master) is redistributed to the processors (workers), FUN is applied to the redistributed data, and the results are gathered to rank.source. “In SPMD, master is one of workers.” The ... arguments are also sent from rank.source with scatter().
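The scatter, apply, and gather steps described above can be emulated serially in base R. This is only an illustrative sketch that does not use MPI: the names n.workers, row.id, chunks, and partial are hypothetical, and the partitioning that pbdMPI actually uses may differ from the contiguous row blocks assumed here.

```r
### Serial emulation of the "mw" flow; no MPI is involved.
n.workers <- 2                      # hypothetical number of workers
X <- matrix(1:100, ncol = 10)

### "Scatter": assign one contiguous block of row indices to each worker.
row.id <- split(seq_len(nrow(X)),
                cut(seq_len(nrow(X)), n.workers, labels = FALSE))
chunks <- lapply(row.id, function(i) X[i, , drop = FALSE])

### Each worker applies FUN (here sum over rows) to its own block.
partial <- lapply(chunks, function(block) apply(block, 1, sum))

### "Gather": concatenate the partial results in worker order.
y <- unlist(partial, use.names = FALSE)

### The combined result matches the serial apply().
stopifnot(identical(y, apply(X, 1, sum)))
```

The final check reflects the statement that the "mw" mode works the same as the serial version: splitting the work changes where the rows are processed, not the result.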
If pbd.mode = "spmd", the same copy of X is expected to exist on all processors, and the original apply(), lapply(), or sapply() operates on only a part of X on each processor. An allgather() or gather() call is required to aggregate the results manually.
If pbd.mode = "dist", a different X is expected to exist on each processor, i.e. a ‘distinct or distributed’ X, and the original apply(), lapply(), or sapply() operates on all of the local X. An allgather() or gather() call is required to aggregate the results manually.
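The manual aggregation step for the "spmd" and "dist" modes might look like the following sketch. It assumes pbdMPI is installed with a working MPI, that the script is saved under the hypothetical name "gather_demo.r", and that allgather() returns a list with one element per rank; the variable names are illustrative only.

```r
### Run with, e.g.,
### SHELL> mpiexec -np 2 Rscript gather_demo.r
suppressMessages(library(pbdMPI, quietly = TRUE))
init()

### "spmd": every rank builds the same X and processes only its share.
x <- split(1:100, rep(1:10, each = 10))
y.local <- pbdLapply(x, sum, pbd.mode = "spmd")

### Aggregate manually: collect the local results from all ranks.
y.all <- unlist(allgather(y.local))
comm.print(y.all)

### "dist": each rank holds its own X and processes all of it.
x.dist <- split((1:20) + 20 * comm.rank(), rep(1:2, each = 10))
z.local <- pbdLapply(x.dist, sum, pbd.mode = "dist")
z.all <- unlist(allgather(z.local))
comm.print(z.all)

finalize()
```

Using gather() instead of allgather() would collect the results on a single rank only, which is usually enough when one rank writes the output.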
In SPMD, it is better to split the data into pieces so that X is a local matrix on each processor. The original apply() should be sufficient in this case.
A list or matrix will be returned.
Wei-Chen Chen wccsnow@gmail.com, George Ostrouchov, Drew Schmidt, Pragneshkumar Patel, and Hao Yu.
Programming with Big Data in R Website: http://r-pbd.org/
## Not run: 
### Save code in a file "demo.r" and run with 2 processors by
### SHELL> mpiexec -np 2 Rscript demo.r

### Initial.
suppressMessages(library(pbdMPI, quietly = TRUE))
init()
.comm.size <- comm.size()
.comm.rank <- comm.rank()

### Example for pbdApply.
N <- 100
x <- matrix((1:N) + N * .comm.rank, ncol = 10)
y <- pbdApply(x, 1, sum, pbd.mode = "mw")
comm.print(y)

y <- pbdApply(x, 1, sum, pbd.mode = "spmd")
comm.print(y)

y <- pbdApply(x, 1, sum, pbd.mode = "dist")
comm.print(y)

### Example for pbdApply for 3D array.
N <- 60
x <- array((1:N) + N * .comm.rank, c(3, 4, 5))
dimnames(x) <- list(lat = paste("lat", 1:3, sep = ""),
                    lon = paste("lon", 1:4, sep = ""),
                    time = paste("time", 1:5, sep = ""))
comm.print(x[,, 1:2])

y <- pbdApply(x, c(1, 2), sum, pbd.mode = "mw")
comm.print(y)

y <- pbdApply(x, c(1, 2), sum, pbd.mode = "spmd")
comm.print(y)

y <- pbdApply(x, c(1, 2), sum, pbd.mode = "dist")
comm.print(y)

### Example for pbdLapply.
N <- 100
x <- split((1:N) + N * .comm.rank, rep(1:10, each = 10))
y <- pbdLapply(x, sum, pbd.mode = "mw")
comm.print(unlist(y))

y <- pbdLapply(x, sum, pbd.mode = "spmd")
comm.print(unlist(y))

y <- pbdLapply(x, sum, pbd.mode = "dist")
comm.print(unlist(y))

### Finish.
finalize()

## End(Not run)