[R] More efficient option to append()?
Daniel Nordlund
djnordlund at frontier.com
Thu Aug 18 01:35:48 CEST 2011
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Alex Ruiz Euler
> Sent: Wednesday, August 17, 2011 3:54 PM
> To: r-help at r-project.org
> Subject: [R] More efficient option to append()?
>
>
> Dear R community,
>
> I have a 2 million by 2 matrix that looks like this:
>
> x<-sample(1:15,2000000, replace=T)
> y<-sample(1:10*1000, 2000000, replace=T)
>       x     y
> [1,] 10  4000
> [2,]  3  1000
> [3,]  3  4000
> [4,]  8  6000
> [5,]  2  9000
> [6,]  3  8000
> [7,]  2 10000
> (...)
>
>
> The first column is a population expansion factor for the number in the
> second column (household income). I want to expand the second column
> with the first so that I end up with a vector beginning with 10
> observations of 4000, then 3 observations of 1000 and so on. In my mind
> the natural approach would be to create a NULL vector and append the
> expansions:
>
> myvar<-NULL
> myvar<-append(myvar, replicate(x[1],y[1]), 1)
>
> for (i in 2:length(x)) {
> myvar<-append(myvar,replicate(x[i],y[i]),sum(x[1:i])+1)
> }
>
> to end with a vector of sum(x), which in my real database corresponds
> to 22 million observations.
>
> This works fine if I only run it for the first, say, 1000
> observations. If I try to perform this on all 2 million observations,
> it takes far too long to be useful (I left it running
> 11 hours yesterday to no avail).
>
>
> I know R performs well with operations on relatively large vectors. Why
> is this so inefficient? And what would be the smart way to do this?
>
> Thanks in advance.
> Alex
>
Alex,
does the following do what you want?
myvar <- rep(y,x)
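As a rough check, assuming x and y are vectors like the ones in your message, the sketch below compares a loop-and-append version with rep() on a reduced sample. The size n = 1000 and the names n and myvar_loop are my own choices for illustration, picked so the loop finishes quickly. When the second argument of rep() is a vector of the same length as the first, each y[i] is repeated x[i] times.

```r
# Reduced stand-in for the data in the question (n is an assumed, smaller size)
set.seed(1)
n <- 1000
x <- sample(1:15, n, replace = TRUE)          # expansion factors
y <- sample(1:10 * 1000, n, replace = TRUE)   # household incomes

# Loop-and-append approach: every append() call builds a new, longer vector
myvar_loop <- NULL
for (i in seq_along(x)) {
  myvar_loop <- append(myvar_loop, rep(y[i], x[i]))
}

# Vectorized approach: y[i] is repeated x[i] times in a single call
myvar <- rep(y, x)

# Both approaches should give the same vector, of length sum(x)
stopifnot(identical(myvar_loop, myvar))
stopifnot(length(myvar) == sum(x))
```

The loop is most likely slow on the full data because each append() call copies everything accumulated so far, so the total work grows roughly quadratically with the number of rows, whereas rep() builds the result in one call.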
Hope this is helpful,
Dan
Daniel Nordlund
Bothell, WA USA