[R] Ways to speed up R code?
Duncan Murdoch
murdoch at stats.uwo.ca
Tue Oct 18 11:45:02 CEST 2005
ecoinfo wrote:
> Hi R-users:
>
> Yesterday I ran some R code for 9 hours and it showed no sign of
> stopping. Then I interrupted it and found it had completed 82.5%.
>
> This morning I decided to wait another 11 hours to see what would
> happen. Wait a minute, I had heard that converting a data.frame to a
> matrix makes R code faster, so I made that change in my code.
> Oooh, the new code finished within 30 minutes!!
>
> Are there any other tips for speeding up R programs? Or could someone
> point me to some documents or websites on R code optimization?
>
> #OS: Win XP, CPU: Pentium IV, 3.20G, Memory: 1G
> #for() loop: 1000*1616*3*41, 3 data.frames (dim = c(1616,5), c(1616),
> c(1616) respectively)
- As you found, indexing operations on matrices are much faster than on
data frames.
- Avoid growing objects incrementally (for example, appending to a
vector inside a loop): calculate the size you need, then allocate it
all at once.
- Vectorize calculations.
- Use Rprof() to identify where your code is spending its time, and
concentrate your efforts on that area. Perhaps translate some essential
routines into compiled C or Fortran.
- For a smaller improvement that might not suit your application,
convert factors to their numeric codes.
- Break up long calculations into smaller pieces, so you can write out
intermediate values. This doesn't necessarily speed it up, but it lets
you stop and restart the calculation. It may also make it more suited
to running on a cluster of computers instead of just one.
- Limit your use of memory so you don't end up using a swap file. Do
this by only keeping objects that will be used later, removing others.
(With the size of objects you were working with this may not be an issue.)
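A rough sketch of the first three points above, in case it helps. The
data are simulated and the function names (sum_cells, grow, prealloc,
vectorized) are made up for illustration; none of this comes from the
original code, and the timings will vary with machine and R version.

```r
# Hypothetical data, sized like the 1616 x 5 data frame in the question.
set.seed(1)
n  <- 1616
m  <- matrix(runif(n * 5), nrow = n)   # plain numeric matrix
df <- as.data.frame(m)                 # same values stored as a data frame

# 1. Scalar indexing inside a loop: data frame versus matrix.
sum_cells <- function(x) {
  total <- 0
  for (i in seq_len(nrow(x))) total <- total + x[i, 1]
  total
}
t_df <- system.time(s_df <- sum_cells(df))
t_m  <- system.time(s_m  <- sum_cells(m))

# 2. Growing a result versus allocating it once.
grow <- function(k) {
  out <- numeric(0)
  for (i in seq_len(k)) out <- c(out, i^2)  # copies the vector on each pass
  out
}
prealloc <- function(k) {
  out <- numeric(k)                         # allocated once, then filled
  for (i in seq_len(k)) out[i] <- i^2
  out
}

# 3. Vectorized form: the loop disappears entirely.
vectorized <- function(k) seq_len(k)^2

k <- 10000
t_grow <- system.time(r_grow <- grow(k))
t_pre  <- system.time(r_pre  <- prealloc(k))
t_vec  <- system.time(r_vec  <- vectorized(k))

# Elapsed times in seconds, for comparison on your own machine.
print(rbind(data.frame = t_df, matrix = t_m, grow = t_grow,
            prealloc = t_pre, vectorized = t_vec)[, "elapsed"])
```

All three versions in each group return the same values; only the time
taken differs. To find out which part of a real script deserves this
treatment, profile it with Rprof() first.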
Duncan Murdoch