| apply and lapply {pbdMPI} | R Documentation |
These functions are parallel versions of the apply(), lapply(), and sapply() functions.
pbdApply(X, MARGIN, FUN, ..., pbd.mode = c("mw", "spmd", "dist"),
rank.source = .pbd_env$SPMD.CT$rank.root,
comm = .pbd_env$SPMD.CT$comm,
barrier = TRUE)
pbdLapply(X, FUN, ..., pbd.mode = c("mw", "spmd", "dist"),
rank.source = .pbd_env$SPMD.CT$rank.root,
comm = .pbd_env$SPMD.CT$comm,
bcast = FALSE, barrier = TRUE)
pbdSapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE,
pbd.mode = c("mw", "spmd", "dist"),
rank.source = .pbd_env$SPMD.CT$rank.root,
comm = .pbd_env$SPMD.CT$comm,
bcast = FALSE, barrier = TRUE)
X |
a matrix or array in |
MARGIN |
|
FUN |
as in the |
... |
optional arguments to |
simplify |
as in the |
USE.NAMES |
as in the |
pbd.mode |
mode of distributed data |
rank.source |
a rank of source where |
comm |
a communicator number. |
bcast |
whether to broadcast the result to all ranks. |
barrier |
whether to synchronize all ranks with a barrier. |
All functions are mainly called in manager/workers mode
(pbd.mode = "mw"), where they work the same as their
serial versions.
If pbd.mode = "mw", the X on rank.source (the master)
is redistributed to the processors (workers), FUN is applied
to the redistributed data, and the results are gathered back to rank.source.
“In SPMD, master is one of workers.”
The arguments in ... are also distributed from rank.source with scatter().
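The scatter, apply, and gather steps of manager/workers mode can be sketched serially in base R. This is only an illustration and makes no MPI or pbdMPI calls: n.ranks, owner, chunks, and partial are hypothetical names, and the contiguous row blocks are an assumed partition, not necessarily the one pbdMPI uses.

```r
# Serial emulation of pbd.mode = "mw"; no MPI or pbdMPI calls are made.
n.ranks <- 2                        # hypothetical number of ranks
X <- matrix(1:100, ncol = 10)       # data held by the source rank only

# "Scatter": assign the rows of X to ranks in contiguous blocks
# (an assumed partition, chosen for illustration).
owner <- sort(rep_len(seq_len(n.ranks), nrow(X)))
chunks <- lapply(seq_len(n.ranks),
                 function(r) X[owner == r, , drop = FALSE])

# Each rank applies FUN (here sum) to the rows of its own chunk.
partial <- lapply(chunks, function(chunk) apply(chunk, 1, sum))

# "Gather": the source rank concatenates the partial results in rank order.
y.mw <- unlist(partial)

# The gathered result matches the serial call.
stopifnot(identical(y.mw, apply(X, 1, sum)))
```

Because the result ends up on the source rank alone, the other ranks need a broadcast (see bcast) if they are to use it.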
If pbd.mode = "spmd", an identical copy of X is assumed to
exist on all processors, and the original apply(), lapply(),
or sapply() operates on a part of
X on each processor. An allgather() or gather() call is required to
aggregate the results manually.
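The need for a manual aggregation step in this mode can be seen in a serial base R sketch. It makes no MPI or pbdMPI calls: local.result and owner are hypothetical names, the contiguous split of the list elements is an assumption, and the final list() plus unlist() only stands in for what an allgather() would deliver to every rank.

```r
# Serial emulation of pbd.mode = "spmd"; no MPI or pbdMPI calls are made.
n.ranks <- 2                                  # hypothetical number of ranks
x <- split(1:100, rep(1:10, each = 10))       # the same list on every rank

# Each rank is responsible for only a part of x (assumed contiguous split).
owner <- sort(rep_len(seq_len(n.ranks), length(x)))

# Work done on one rank; ranks are numbered from 0 as in MPI.
local.result <- function(rank) lapply(x[owner == rank + 1], sum)

# After the parallel call, each rank holds only its own partial result.
y.rank0 <- local.result(0)
y.rank1 <- local.result(1)

# Stand-in for allgather(): every rank would receive all partial results,
# one list element per rank, and could then flatten them.
gathered <- list(y.rank0, y.rank1)
y.all <- unlist(gathered)
```

Without the aggregation step, y.rank0 and y.rank1 each cover only half of x.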
If pbd.mode = "dist", a different X is assumed to
exist on each processor, i.e. a ‘distinct or distributed’ X,
and the original apply(), lapply(), or sapply() operates
on all of the local X. An allgather() or gather() call is
required to aggregate the results manually.
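A serial base R sketch of the distributed case follows. It makes no MPI or pbdMPI calls: local.X and y.local are hypothetical names, the per-rank data follows the same (1:N) + N * rank construction used in the examples on this page, and the list with one element per rank only stands in for the outcome of a gather().

```r
# Serial emulation of pbd.mode = "dist"; no MPI or pbdMPI calls are made.
n.ranks <- 2                        # hypothetical number of ranks
N <- 100

# Each rank builds its own, different X; ranks are numbered from 0.
local.X <- function(rank) matrix((1:N) + N * rank, ncol = 10)

# Each rank applies FUN (here sum) to every row of its local X.
y.local <- lapply(0:(n.ranks - 1),
                  function(rank) apply(local.X(rank), 1, sum))

# Stand-in for gather(): one element per rank, with differing values.
first.row.sums <- sapply(y.local, function(y) y[1])
```

Here first.row.sums holds 460 for rank 0 and 1460 for rank 1, which shows that each rank's result describes only its own data.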
In SPMD, it is better to split the data into pieces so that X is a local
matrix on each processor. In that case, the original apply() should be
sufficient.
A list or matrix will be returned.
Wei-Chen Chen wccsnow@gmail.com, George Ostrouchov, Drew Schmidt, Pragneshkumar Patel, and Hao Yu.
Programming with Big Data in R Website: http://r-pbd.org/
## Not run:
### Save code in a file "demo.r" and run with 2 processors by
### SHELL> mpiexec -np 2 Rscript demo.r
### Initial.
suppressMessages(library(pbdMPI, quietly = TRUE))
init()
.comm.size <- comm.size()
.comm.rank <- comm.rank()
### Example for pbdApply.
N <- 100
x <- matrix((1:N) + N * .comm.rank, ncol = 10)
y <- pbdApply(x, 1, sum, pbd.mode = "mw")
comm.print(y)
y <- pbdApply(x, 1, sum, pbd.mode = "spmd")
comm.print(y)
y <- pbdApply(x, 1, sum, pbd.mode = "dist")
comm.print(y)
### Example for pbdApply for 3D array.
N <- 60
x <- array((1:N) + N * .comm.rank, c(3, 4, 5))
dimnames(x) <- list(lat = paste("lat", 1:3, sep = ""),
lon = paste("lon", 1:4, sep = ""),
time = paste("time", 1:5, sep = ""))
comm.print(x[,, 1:2])
y <- pbdApply(x, c(1, 2), sum, pbd.mode = "mw")
comm.print(y)
y <- pbdApply(x, c(1, 2), sum, pbd.mode = "spmd")
comm.print(y)
y <- pbdApply(x, c(1, 2), sum, pbd.mode = "dist")
comm.print(y)
### Example for pbdLapply.
N <- 100
x <- split((1:N) + N * .comm.rank, rep(1:10, each = 10))
y <- pbdLapply(x, sum, pbd.mode = "mw")
comm.print(unlist(y))
y <- pbdLapply(x, sum, pbd.mode = "spmd")
comm.print(unlist(y))
y <- pbdLapply(x, sum, pbd.mode = "dist")
comm.print(unlist(y))
### Finish.
finalize()
## End(Not run)