[R] Advice on large data structures

Alex Ruiz Euler rruizeuler at ucsd.edu
Fri Sep 2 19:29:02 CEST 2011


Along the lines of one of Jim's suggestions: if you have some
basic MySQL knowledge, check out the RMySQL package. I use it to
partition a matrix similar to yours and convert the pieces to R
objects, and it works fine.
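
A minimal sketch of the partitioning idea, under stated assumptions
rather than a definitive implementation. To run without a MySQL
server it uses an in-memory SQLite database through DBI as a stand-in
(this assumes the DBI and RSQLite packages are installed); with RMySQL
only the dbConnect() call would change. The table name "bigmat", the
column names, the toy size, and the rolling correlation are all
hypothetical choices made for illustration.

```r
library(DBI)

# Size of one in-memory copy of the matrix from the original question:
# 5 million rows x 200 columns x 8 bytes per double.
full_bytes <- 5e6 * 200 * 8
full_bytes / 1e9   # 8 (GB)

# Stand-in connection (assumption: RSQLite is installed). With RMySQL this
# would be dbConnect(RMySQL::MySQL(), ...) with your own connection details.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Toy table with a hypothetical name and columns; "id" keeps the row order.
n <- 1000
toy <- data.frame(id = seq_len(n), x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dbWriteTable(con, "bigmat", toy)

# Pull only the two columns needed for the current step.
two_cols <- dbGetQuery(con, "SELECT x1, x2 FROM bigmat ORDER BY id")

# A rolling function of windows of two columns: here, a rolling correlation.
width <- 50
starts <- seq_len(n - width + 1)
roll_cor <- vapply(starts, function(i) {
  idx <- i:(i + width - 1)
  cor(two_cols$x1[idx], two_cols$x2[idx])
}, numeric(1))

dbDisconnect(con)
```

Only the two selected columns are held in R at any time, so the
memory cost per step scales with the columns in use rather than with
all 200.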

Hope this helps,
A.


On Fri, 2 Sep 2011 06:33:13 -0400
Jim Holtman <jholtman at gmail.com> wrote:

> I would suggest that if you want to use R, you get a 64-bit version with 24GB of memory to start.  If your data is a numeric matrix, you will need about 8GB for a single copy (5 million rows x 200 columns x 8 bytes per value).
> 
> Do you really need it all in memory at once, or can you partition the problem?  Can you use a database to access the portion you need at any time?
> 
> If you only need one or two columns at a time, then a database storing the columns might work.  You probably need some more analysis of exactly how you want to solve your problem, given the limitations of the system.
> 
> On Sep 2, 2011, at 1:13, Worik R <worikr at gmail.com> wrote:
> 
> > Friends
> > 
> > I am starting on a (section of the) project where I need to build a matrix
> > with on the order of 5 million rows and 200 columns.
> > 
> > I am wondering if I can stay in R.
> > 
> > I need to do rollapply type operations on the columns, including some that
> > will be functions of (windows of) two columns.
> > 
> > I have been looking at the ff and bigmemory packages but am not sure that
> > they will do.
> > 
> > Before I get too deep, can someone offer some wisdom about what the best
> > direction to go would be?
> > 
> > Switching to C/C++ is definitely an option if it is all too hard.
> > 
> > cheers
> > Worik
> > 
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.


