[R] Performance & capacity characteristics of R?

Ross Ihaka ihaka at stat.auckland.ac.nz
Tue Aug 3 10:14:47 CEST 1999

On Tue, Aug 03, 1999 at 06:57:38AM +0000, Karsten M. Self wrote:
> I hope this is merely a FAQ, and not an AFAQ (annoyingly....).
> I'm a SAS programmer, with several years' experience of the system,
> evaluating alternatives.  See the SAS for Linux website (URL in sig) for
> more info.
> I'm exploring R's capabilities and limitations.  I'd be very interested
> in having a deeper understanding of its capacity and performance
> limitations in dealing with very large datasets, which I would classify
> as tables with 1 million to 100s of millions of rows and two - 100+
> fields (variables) generally of 8 bytes -- call it a 16 - 800 byte
> record length.
> Can R handle such large datasets (tables)?  What are the general
> parameters for memory requirements?  How great a performance hit does
> running to swap (virtual memory) entail?  What common
> procedures|functions under R use significantly more memory?  Are there
> guidelines or documentation which point to issues and parameters of
> large file|dataset processing under R?

R is not intended for data sets of the size you describe.  It is
intended to handle data sets of a few tens of megabytes at most.
Unlike SAS, it holds complete data sets in memory.
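
To give a rough sense of the scale involved, here is a back-of-the-envelope
sketch of the raw size of the tables described in the question, assuming
every field is 8 bytes and ignoring any per-object overhead.  It is written
in Python purely for the arithmetic; the function names are hypothetical
and nothing here reflects how R itself accounts for memory.

```python
BYTES_PER_FIELD = 8  # assumption taken from the question: fields are generally 8 bytes


def table_bytes(rows, fields, bytes_per_field=BYTES_PER_FIELD):
    """Raw size in bytes of a table held entirely in memory (no overhead counted)."""
    return rows * fields * bytes_per_field


def in_megabytes(n_bytes):
    """Convert bytes to decimal megabytes (1 MB = 1,000,000 bytes)."""
    return n_bytes / 1_000_000


# Smallest and largest cases from the question.
small = table_bytes(1_000_000, 2)        # 1 million rows, 2 fields
large = table_bytes(100_000_000, 100)    # 100 million rows, 100 fields

print(f"smallest case: {in_megabytes(small):,.0f} MB")
print(f"largest case:  {in_megabytes(large):,.0f} MB")
```

Under these assumptions the smallest case is about 16 MB, which is already
near the "few tens of megabytes" mark, and the largest is about 80,000 MB.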

We are currently looking at the memory management and performance
issues, but the scale of data processing you describe will need a
different kind of tool.  On the rare occasions I encounter problems of
the size you describe I usually do significant preprocessing with
something like perl.
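
The kind of preprocessing meant here can be sketched as a single streaming
pass that keeps only the rows and columns of interest, so that the reduced
file is small enough to load afterwards.  The sketch below uses Python as a
stand-in for perl; the function name, column names, and sample data are all
made up for illustration, and a real job would read from and write to files
rather than in-memory strings.

```python
import csv
import io


def reduce_table(src, dst, keep, row_filter):
    """Copy src to dst one row at a time, keeping only the columns named in
    `keep` for rows that pass `row_filter`.  Only one row is held in memory
    at any moment.  Returns the number of rows written."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=keep, lineterminator="\n")
    writer.writeheader()
    kept = 0
    for row in reader:
        if row_filter(row):
            writer.writerow({name: row[name] for name in keep})
            kept += 1
    return kept


# Hypothetical sample data standing in for a very large input file.
raw = io.StringIO(
    "id,region,amount\n"
    "1,north,10.5\n"
    "2,south,3.0\n"
    "3,north,7.25\n"
)
out = io.StringIO()

n = reduce_table(raw, out, keep=["id", "amount"],
                 row_filter=lambda r: r["region"] == "north")
print(n, "rows kept")
print(out.getvalue())
```

Because the input is never held in memory as a whole, the cost of this step
depends on the size of a single row rather than on the size of the table.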

Given that the internals of R are not yet in a final state, I would
hesitate to make precise performance statements or recommendations.
We will be in a better position to do so when the 1.0 release comes.

