## Overview
The "Programming with Big Data in [R](http://www.r-project.org/)"
project (pbdR) is a set of [highly scalable
R](https://www.hpcwire.com/2016/07/06/olcf-researchers-scale-r-tackle-big-science-data-sets/)
packages for distributed computing and profiling in data science.
Our packages include high performance, high-level interfaces to
[MPI](https://en.wikipedia.org/wiki/Message_Passing_Interface),
[ZeroMQ](http://zeromq.org/),
[ScaLAPACK](https://en.wikipedia.org/wiki/ScaLAPACK),
[NetCDF4](https://en.wikipedia.org/wiki/NetCDF),
[PAPI](http://icl.utk.edu/papi/), and more. While
these libraries [shine brightest on large distributed
systems](https://www.olcf.ornl.gov/2016/07/05/olcf-expands-data-analytics-capability-with-popular-programming-language/),
they also [work rather well on small
clusters](https://www.vldb.org/pvldb/vol11/p2168-thomas.pdf#search="pbdR") and,
perhaps surprisingly, usually even on a laptop with only two cores.

The pbdR project won the Oak Ridge National Laboratory 2016 Significant
Event Award for "Harnessing HPC Capability at OLCF with the R Language for
Deep Data Science." OLCF is the Oak Ridge Leadership Computing
Facility, which currently includes Summit, the second most powerful computer
system in the world.
## Contact
* **Discussion group**: [RBigDataProgramming](https://groups.google.com/forum/?fromgroups#!forum/rbigdataprogramming)
(preferred)
* **Email**: RBigData AT gmail
## Authors and Citation
* [Wei-Chen Chen](http://snoweye.github.io/)
* [George Ostrouchov](http://www.csm.ornl.gov/%7Eost/)
* Pragneshkumar Patel
* [Drew Schmidt](http://wrathematics.github.io/)

Please cite the individual packages you use, as well as this web page:
```
@ONLINE{pbdR2012,
  author = {Ostrouchov, G. and Chen, W.-C. and Schmidt, D. and Patel, P.},
  title = {Programming with Big Data in R},
  year = {2012},
  organization = {Oak Ridge National Laboratory and University of Tennessee},
  url = {http://r-pbd.org/}
}
## Cite individual packages by running:
citation("package")
```
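To make the `citation()` step concrete, here is a minimal sketch in R. The helper name `cite_package` is hypothetical and not part of pbdR; it only wraps the standard `citation()` and `toBibtex()` functions from R's `utils` package. `pbdMPI` stands in for whichever pbdR package you actually used, and is assumed to be installed.

```r
# Hypothetical helper (not part of pbdR): return the BibTeX entry for an
# installed package, or NULL if the package is not installed.
cite_package <- function(pkg) {
  # system.file() returns "" when the package cannot be found,
  # and checks this without loading the package.
  if (!nzchar(system.file(package = pkg))) {
    message("Package '", pkg, "' is not installed; nothing to cite.")
    return(NULL)
  }
  utils::toBibtex(utils::citation(pkg))
}

# Assumption: pbdMPI is one of the packages you used; substitute your own.
bib <- cite_package("pbdMPI")
if (!is.null(bib)) print(bib)
```

Calling the helper once per package you relied on gives a set of BibTeX entries that can sit alongside the web page entry above.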
## Funding
This project, including software, documentation, talks, and tutorials,
is or has been supported in part by the following:
- Oak Ridge Leadership Computing
Facility, which is a DOE Office of Science User Facility supported
under Contract DE-AC05-00OR22725.
- Division of Mathematical Sciences, National Science Foundation,
Award No.
[1418195](http://www.nsf.gov/awardsearch/showAward?AWD_ID=1418195),
2014-2019.
- The National Institute for Mathematical and Biological Synthesis,
under Award Nos. EF-0832858 and DBI-1300426, 2013-2014.
- The Division of Molecular and Cellular Biosciences, National Science
Foundation Award MCB-1120370, 2013-2014.
- The Office of Cyberinfrastructure of the U.S. National Science
Foundation under Award No. ARRA-NSF-OCI-0906324 for NICS-RDAV
center, 2012-2013.
- U.S. Department of Energy Office of Science under Contract No.
DE-AC05-00OR22725, 2011-2013.
## Acknowledgements
We thank everyone who has submitted a bug report for the pbdR project.
We also thank the members of [CRAN](http://cran.r-project.org/) for
their help with and suggestions on pbdR packages, as well as their tireless
efforts to develop and support R and its extensions.