[R] Passing lists and R memory usage growth

Rajeev Ayyagari rajeev.ayyagari at gmail.com
Sat Oct 3 21:02:10 CEST 2009


Duncan:

I took your suggestion and upgraded to R 2.9.2, but the problem persists.

I am not able to reproduce the problem in a simple case. In my
actual code the functions some.function.1() and some.function.2() are
quite complicated and call various other functions which also access
elements of the list.  If I can find a simple way to reproduce it, I
will post the code to the list.

I know it must be the "results" list in the pseudocode which is
causing the problem because:

1. I tried tracemem() on par and results; results is duplicated
several times but par is not.

2. I can eliminate the memory problem completely by rewriting
some.function.1() and some.function.2() to accept individual elements
of the list as arguments, and passing several list elements such as
results$gamma[[iter-1]] in the call, rather than passing the entire
list as a single argument.  This makes the code harder to read, but
the memory growth goes away.
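
For illustration only, the check in point 1 and the workaround in point 2
can be sketched as below. This is not the actual code: use.whole.list(),
use.elements(), the "scratch" element and the toy data are all
hypothetical stand-ins, and tracemem() only works if R was built with
memory profiling, so the sketch guards it with capabilities("profmem").

```r
# Hypothetical stand-in: receives the whole list and writes to its local
# copy, which forces R to duplicate the list.
use.whole.list <- function(iter, res) {
  res$scratch <- 0                  # modifying 'res' triggers a copy
  sum(res$gamma[[iter - 1]])
}

# Hypothetical stand-in: receives only the element it needs, so the
# 'results' list itself is never modified inside the function.
use.elements <- function(gamma.prev) {
  sum(gamma.prev)
}

# Toy data standing in for the real results list
results <- list(gamma = list(rnorm(1e5), rnorm(1e5)), beta = list())

# tracemem() is only available when R was built with memory profiling
can.trace <- capabilities("profmem")
if (can.trace) tracemem(results)

a <- use.whole.list(3, results)          # may report a duplication
b <- use.elements(results$gamma[[2]])    # should report none

if (can.trace) untracemem(results)
```

Both calls compute the same value; the difference is only in whether the
list is copied along the way. Whether this toy case shows the same memory
growth as the real program is untested.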

Regards
Rajeev.

On Sat, Oct 3, 2009 at 1:43 PM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
> You need to give reproducible code for a question like this, not pseudocode.
>
> And you should consider using a recent version of R, not the relatively
> ancient 2.8.1 (which was released in late 2008).
>
> Duncan Murdoch
>
> On 03/10/2009 1:30 PM, Rajeev Ayyagari wrote:
>>
>> Hello,
>>
>> I can't think of an explanation for this memory allocation behaviour
>> and was hoping someone on the list could help out.
>>
>> Setup:
>> ------
>>
>> R version 2.8.1, 32-bit Ubuntu 9.04 Linux, Core 2 Duo with 3GB ram
>>
>> Description:
>> ------------
>>
>> Inside a for loop, I am passing a list to a function.  The function
>> accesses various members of the list.
>>
>> I understand that in this situation, the entire list may be duplicated
>> in each function call.  That's ok.  But the memory given to these
>> duplicates doesn't seem to be recovered by the garbage collector after
>> the function call has ended, and more memory is allocated in each
>> iteration. (See output below.)
>>
>> I also tried summing object.size() over all objects in all
>> environments, and the total stays constant at about 15 Mbytes in each
>> iteration.  But overall memory consumption as reported by gc() (and my
>> operating system) keeps going up, to 2 Gbytes and more.
>>
>> Pseudocode:
>> -----------
>>
>> # This function and its callees need a 'results' list
>> some.function.1 <- function(iter, res, par)
>> {
>>  # access res$gamma[[iter-1]], res$beta[[iter-1]]
>>  ...
>> }
>>
>> # This function and its callees need a 'results' list
>> some.function.2 <- function(iter, res, par)
>> {
>>  # access res$gamma[[iter-1]], res$beta[[iter-1]]
>>  ...
>> }
>>
>> # Some parameters
>> par <- list( ... )
>>
>> # List storing results.
>> # Only results$gamma[1:3], results$beta[1:3] are used
>> results <- list(gamma = list(), beta = list())
>>
>> for (iter in 1:100)
>> {
>>  print(paste("Iteration ", iter))
>>
>>  # min(iter, 3) is the most recent slot of results$gamma etc.
>>  results$gamma[[min(iter, 3)]] <- some.function.1(min(iter, 3), results, par)
>>  results$beta[[min(iter, 3)]] <- some.function.2(min(iter, 3), results, par)
>>
>>  # Delete earlier results
>>  if (iter > 2)
>>  {
>>    results$gamma[[1]] <- NULL
>>    results$beta[[1]] <- NULL
>>  }
>>
>>  # Report on memory usage
>>  gc(verbose=TRUE)
>> }
>>
>> Output from an actual run of my program:
>> ----------------------------------------
>>
>> [1] "Iteration  1"
>> Garbage collection 255 = 122+60+73 (level 2) ...
>> 6.1 Mbytes of cons cells used (48%)
>> 232.3 Mbytes of vectors used (69%)
>> [1] "Iteration  2"
>> Garbage collection 257 = 123+60+74 (level 2) ...
>> 6.1 Mbytes of cons cells used (48%)
>> 238.3 Mbytes of vectors used (67%)
>> [1] "Iteration  3"
>> Garbage collection 258 = 123+60+75 (level 2) ...
>> 6.1 Mbytes of cons cells used (49%)
>> 242.8 Mbytes of vectors used (69%)
>> [1] "Iteration  4"
>> Garbage collection 259 = 123+60+76 (level 2) ...
>> 6.2 Mbytes of cons cells used (49%)
>> 247.3 Mbytes of vectors used (66%)
>> [1] "Iteration  5"
>> Garbage collection 260 = 123+60+77 (level 2) ...
>> 6.2 Mbytes of cons cells used (50%)
>> 251.8 Mbytes of vectors used (68%)
>> ...
>>
>> Thanks,
>> Rajeev.
>>
>
>



