[R] any way to make the code more efficient ?
Charles C. Berry
cberry at tajo.ucsd.edu
Fri Dec 8 23:52:55 CET 2006
Save your intermediate results as a list of matrices.
Then rbind them all at once using do.call.
It looks like this will save 23 seconds (see below), if you are running on
a PC like mine (AMD 2GHz, WinXP).
But I wonder: if a mere 23 seconds is all you save, is this really worth
worrying about?
Maybe you are losing time elsewhere.
If so, you need to profile this run and/or track memory usage.
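
As a rough sketch of what profiling and memory tracking could look like, the
following uses base R's Rprof(), summaryRprof() and gc(). The function
grow_by_rbind() and the block count of 200 are hypothetical stand-ins for the
real loop, chosen only so the example runs on its own.

```r
# Hypothetical stand-in for the real loop: grows a matrix by repeated rbind.
grow_by_rbind <- function(n_blocks, block) {
  out <- NULL
  for (i in seq_len(n_blocks)) out <- rbind(out, block)
  out
}

block <- matrix(1:(1400 * 4), ncol = 4)

invisible(gc(reset = TRUE))        # reset the "max used" memory counters

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file)                   # start the sampling profiler
res <- grow_by_rbind(200, block)
Rprof(NULL)                        # stop profiling

# summaryRprof() can fail if the run was too short to record any samples,
# so guard the call rather than assume a profile exists.
prof <- tryCatch(summaryRprof(prof_file), error = function(e) NULL)
if (!is.null(prof)) print(head(prof$by.total))

print(gc())                        # "max used" shows peak memory since the reset
```

If most of the sampled time is attributed to rbind, the growing object is the
likely culprit; if it is attributed to something else, the time is being lost
elsewhere.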
> amat <- NULL
> mat.1400.by.4 <- matrix(1:(1400*4),nc=4)
> system.time(for (i in 1:500) amat <- rbind(amat, mat.1400.by.4 ))
[1] 20.05 1.53 23.24 NA NA
>
> list.of.matrices <- rep( list( mat.1400.by.4 ) , 500 )
> system.time( amat2 <- do.call(rbind, list.of.matrices ) )
[1] 0.08 0.00 0.08 NA NA
> all.equal(amat,amat2)
[1] TRUE
>
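
Applied to a loop like the one quoted below, the same idea might look roughly
like this. It is only a sketch: read_one_day() is a hypothetical stand-in for
clnaggcompcurrencyfile(), it returns plain matrices rather than zoo objects,
and every column name other than "bidask" is an assumption.

```r
# Hypothetical stand-in for clnaggcompcurrencyfile(): one day of data as a
# 1400 x 4 numeric matrix. Only the "bidask" column name comes from the
# original code; the other names are made up for illustration.
read_one_day <- function(day_index) {
  m <- matrix(as.numeric(day_index), nrow = 1400, ncol = 4)
  colnames(m) <- c("bid", "ask", "bidask", "volume")
  m
}

n_days <- 50                        # assumed; the original loop runs about 500 times
daily <- vector("list", n_days)     # the number of days is known up front

for (i in seq_len(n_days)) {
  day <- read_one_day(i)
  # Store each day's result in the list instead of rbinding inside the loop.
  daily[[i]] <- cbind(day, logbidask = log(day[, "bidask"]))
}

# Bind everything once, after the loop.
full <- do.call(rbind, daily)
dim(full)
```

Since zoo provides its own rbind method, the same pattern should carry over to
a list of zoo objects, though that case is not timed here.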
On Fri, 8 Dec 2006, Leeds, Mark (IED) wrote:
> The code below works, which is why I didn't include the data to
> reproduce it. It loops about 500
> times and each time, a zoo object with 1400 rows and 4 columns gets
> created (the rows represent minutes, so each file is one day's
> worth of data). Inside the loop, I keep rbinding the newly created zoo
> object to the current zoo object so that it gets bigger and
> bigger over time.
>
> Eventually, the new zoo object, fullaggfxdata, containing all the days
> of data is created.
>
> I was just wondering if there is a more efficient way of doing this. I
> do know at the beginning how many times the loop will run, so
> maybe creating a matrix or data frame at the beginning and putting
> the daily ones into it would
> make it faster. But the problem with this is that I eventually do need a
> zoo object. I ask this question because at around the 250
> mark of the loop, things start to slow down significantly, and I think I
> remember reading somewhere that doing an rbind of something to itself is
> not a good idea. Thanks.
>
> #=======================================================================
>
> start <- 1
>
> for (filecounter in (1:length(datafilenames))) {
>
>   print(paste("File Counter = ", filecounter))
>   datafile <- paste(datadir, "/", datafilenames[filecounter], sep = "")
>   aggfxdata <- clnaggcompcurrencyfile(fxfile = datafile,
>                                       aggminutes = aggminutes,
>                                       fillholes = 1)
>   logbidask <- log(aggfxdata[, "bidask"])
>   aggfxdata <- cbind(aggfxdata, logbidask)
>
>   if (start == 1) {
>     fullaggfxdata <- aggfxdata
>     start <- 0
>   } else {
>     fullaggfxdata <- rbind(fullaggfxdata, aggfxdata)
>   }
>
> }
>
> #=======================================================================
Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0717