[R] renaming objects

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Mar 4 08:06:01 CET 2008


On Mon, 3 Mar 2008, Nordlund, Dan (DSHS/RDA) wrote:

[..., quoting Hadley Wickham]

>>>> gc()
>>>          used (Mb) gc trigger (Mb) max used (Mb)
>>> Ncells 133095  3.6     350000  9.4   350000  9.4
>>> Vcells  87049  0.7     786432  6.0   478831  3.7
>>>> a <- runif(1e7)
>>>> gc()
>>>            used (Mb) gc trigger (Mb) max used (Mb)
>>> Ncells   133112  3.6     350000  9.4   350000  9.4
>>> Vcells 10087364 77.0   11458389 87.5 10087374 77.0
>>>> b <- a
>>>> gc()
>>>            used (Mb) gc trigger (Mb) max used (Mb)
>>> Ncells   133117  3.6     350000  9.4   350000  9.4
>>> Vcells 10087365 77.0   12111308 92.5 10087476 77.0
>>>
>>> R will only create a copy if either of a or b is modified.

> But, the OP should know that in the above scenario, if a or b is changed 
> the copy will be created, doubling the storage requirements.  Of course, 
> this can be prevented by removing vector a after the assignment.

Hadley was correct: it is not prevented by removing 'a', as R does not 
have reference counting.  For example:

rm(a)
b[1] <- 1
gc()
            used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   133947  3.6     350000   9.4   350000   9.4
Vcells 10087573 77.0   21337085 162.8 20087562 153.3

Note the 'max used' Vcells.
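
The copy-on-modify behaviour can also be checked by looking at the values 
instead of the gc() figures.  What follows is only a minimal sketch: the 
names 'can_trace' and 'a_unchanged' are mine, and tracemem() is assumed to 
be usable only when R was built with memory profiling, hence the 
capabilities() guard.

```r
set.seed(1)
a <- runif(1e6)
b <- a                       # 'b' is bound to the same value; nothing is copied yet

# Assumption: tracemem() needs an R build with memory profiling enabled.
can_trace <- capabilities("profmem")[[1]]
if (can_trace) invisible(tracemem(a))   # mark the value currently shared by 'a' and 'b'

b[1] <- 1                    # the shared value is duplicated before being altered;
                             # with tracing on, R should report the duplication here
if (can_trace) untracemem(a)

# 'a' still has its original first element; all other elements agree with 'b'
a_unchanged <- (a[1] != 1) && identical(a[-1], b[-1])
print(a_unchanged)
```

The point of the check on 'a' is that the assignment to b[1] changed the 
binding of 'b' to a new value and left the value bound to 'a' alone.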

There's a fairly complete explanation of what happens in the 'R Internals' 
manual.

I think the most common source of confusion is the term 'objects'.  R 
does not have 'objects' in the sense of named things that can be renamed 
or modified directly: 'a' and 'b' are symbols with bindings to values.  
So you cannot change 'b', but you can change its binding.  When you do 
b[1] <- 1 you may create a new C-level structure as 
the new value, or you may change the existing one.  In this case it 
created a new structure (by copying the old one and altering that). 
Certain replacement functions are the only way to avoid making a new 
value: a <- a+0 for example always creates a new value (at a different 
address in memory) even though its contents will be identical.
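
A rough way to see the a <- a+0 case for yourself is sketched below.  It 
uses the address string that tracemem() returns, which is an assumption 
about your build (memory profiling must be enabled), and the names 
'before', 'addr_before', 'addr_after' and 'same_contents' are only for 
illustration.

```r
a <- c(1.5, 2.5, 3.5)
before <- a                  # keep the old value alive so its address cannot be reused

# Assumption: tracemem() needs an R build with memory profiling enabled.
can_trace <- capabilities("profmem")[[1]]
addr_before <- if (can_trace) tracemem(a) else NA_character_

a <- a + 0                   # arithmetic yields a new value, which 'a' is then bound to

addr_after <- if (can_trace) tracemem(a) else NA_character_
if (can_trace) { untracemem(a); untracemem(before) }

same_contents <- identical(a, before)   # contents are unchanged
print(same_contents)
print(c(addr_before, addr_after))       # two different addresses when tracing is available
```

Keeping 'before' bound to the old value is a deliberate choice here: 
otherwise the old memory could be released and, in principle, handed out 
again, which would make the address comparison less convincing.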

Once we get away from the simplest vectors more sharing can be done: e.g. 
character vectors with duplicate elements will share storage for those 
elements.
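
The sharing within a character vector shows up in object.size(), which 
is documented as taking account of elements shared within a single 
character vector.  The sketch below compares a vector of identical long 
strings with one of distinct strings of similar length; the names 'dup' 
and 'uniq' and the sizes chosen are arbitrary, and the figures reported 
are only a rough indication.

```r
n    <- 1000
long <- paste(rep("x", 100), collapse = "")   # one 100-character string

dup  <- rep(long, n)                # n elements, all the same string
uniq <- paste0(long, seq_len(n))    # n elements, all distinct strings

size_dup  <- object.size(dup)
size_uniq <- object.size(uniq)

print(size_dup,  units = "Kb")      # roughly the cost of n pointers plus one string
print(size_uniq, units = "Kb")      # n pointers plus n separate strings
```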

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


