global distance function {pbdMPI} | R Documentation |
This function computes distances globally across the rows of a gbd matrix held by all ranks.
comm.dist(X.gbd, method = "euclidean", diag = FALSE, upper = FALSE, p = 2, comm = .pbd_env$SPMD.CT$comm, return.type = c("common", "gbd"))
X.gbd |
a gbd matrix. |
method |
as in dist(). |
diag |
as in dist(). |
upper |
as in dist(). |
p |
as in dist(). |
comm |
a communicator number. |
return.type |
return type for the distance, either "common" or "gbd". |
The distance function is implemented for a distributed matrix.
The return type common is only useful when the total number of rows is small, because every rank receives an N * N matrix, where N is the total number of rows of X.gbd across all ranks.
The return type gbd returns a gbd matrix distributed across all ranks. The gbd matrix has 3 columns, named "i", "j", and "value", where (i, j) are the global indices of the i-th and j-th rows of X.gbd, and value is the corresponding distance. The (i, j) pairs are ordered as in a distance matrix.
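The three-column layout described above can be illustrated serially with base R alone. This is a minimal sketch, not pbdMPI's implementation: it assumes that "ordered as a distance matrix" means the column-wise lower-triangle order used by "dist" objects, with i as the row index and j as the column index. The names X, idx, and triplet are made up for the illustration.

```r
# Serial illustration only; no MPI is involved and this is not pbdMPI code.
set.seed(1)
X <- matrix(runif(12), ncol = 3)    # 4 rows, 3 columns
N <- nrow(X)
d <- dist(X)                        # holds N * (N - 1) / 2 distances

# Index pairs of the lower triangle, in column-major order:
# (2,1), (3,1), (4,1), (3,2), (4,2), (4,3).
# Assumption: this matches the (i, j) ordering of the "gbd" return type.
idx <- which(lower.tri(matrix(0, N, N)), arr.ind = TRUE)

# Three columns named "i", "j", and "value", one row per pair.
triplet <- cbind(i = idx[, "row"], j = idx[, "col"], value = as.vector(d))
print(triplet)
```

Each row of triplet can be checked against the full matrix with as.matrix(d)[i, j]. In the distributed case, the rows of such a table would be split across ranks rather than held in one place.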
A full distance matrix is returned from the common return type. Suppose N.gbd is the total number of rows of X.gbd; then the distance will have N.gbd * (N.gbd - 1) / 2 elements and the distance matrix will have N.gbd^2 elements.
A gbd distance matrix with 3 columns is returned from the gbd return type.
The distance or distance matrix could be huge.
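To get a sense of scale, the element counts can be turned into a rough memory estimate. The helper dist.size() below is hypothetical and not part of pbdMPI. It assumes 8 bytes per double-precision value and ignores object overhead and the extra index columns of the gbd return type.

```r
# Hypothetical helper (not part of pbdMPI): rough size of the results
# for N total rows, assuming 8 bytes per double.
dist.size <- function(N, bytes = 8) {
  n.dist <- N * (N - 1) / 2    # elements in the distance (lower triangle)
  n.full <- N^2                # elements in the full distance matrix
  c(dist.elements = n.dist,
    full.elements = n.full,
    dist.GiB = n.dist * bytes / 2^30,
    full.GiB = n.full * bytes / 2^30)
}

print(dist.size(1e5))
```

For 100,000 total rows this gives roughly 37 GiB for the distance and about 75 GiB for the full matrix, and the common return type would place the result on every rank.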
Wei-Chen Chen wccsnow@gmail.com, George Ostrouchov, Drew Schmidt, Pragneshkumar Patel, and Hao Yu.
Programming with Big Data in R Website: http://r-pbd.org/
comm.allpairs() and comm.pairwise().
## Not run: 
### Save code in a file "demo.r" and run with 2 processors by
### SHELL> mpiexec -np 2 Rscript demo.r

### Initial.
suppressMessages(library(pbdMPI, quietly = TRUE))
init()

### Examples.
comm.set.seed(123456, diff = TRUE)

X.gbd <- matrix(runif(6), ncol = 3)
dist.X.common <- comm.dist(X.gbd)
dist.X.gbd <- comm.dist(X.gbd, return.type = "gbd")

### Verify.
dist.X <- dist(do.call("rbind", allgather(X.gbd)))
comm.print(all(dist.X == dist.X.common))

### Verify 2.
dist.X.df <- do.call("rbind", allgather(dist.X.gbd))
comm.print(all(dist.X == dist.X.df[, 3]))
comm.print(dist.X)
comm.print(dist.X.df)

### Finish.
finalize()

## End(Not run)