[R] matrix row product and cumulative product

Jeff Laake Jeff.Laake at noaa.gov
Mon Aug 18 19:14:49 CEST 2008


Sorry a correction to my last posting.  I had accumulate switched 
between prod and cumprod and I had also forgotten to included time for 
conversion from list back to matrix for cumprod.  Now as Chuck stated 
the results for Reduce are about the same or worse than a loop.  

regards--jeff

            Columns     10                 100         1000     10000 
    100000
              Rows         1000000     100000     10000     1000     100
cumprod Loop         1.0                 1.1             1.3         1.2 
    3.0
            Apply         26.4                 3.4             1.8       
  1.3     1.4
            Reduce         1.1                 0.9             1.2       
  1.2     5.3
prod     Loop             0.3                 0.3             0.4       
  0.4     0.9
            Apply          30.0                 2.7             0.7     
    0.5     0.8
            Reduce           0.7                0.6             0.7     
    0.8     3.1


N=10000000
xmat=matrix(runif(N),ncol=10)
system.time(cumprod.matrix(xmat))
system.time(t(apply(xmat,1,cumprod)))
system.time(do.call("cbind",Reduce("*",as.data.frame(xmat),accumulate=TRUE)))
system.time(prod.matrix(xmat))
system.time(apply(xmat,1,prod))
system.time(Reduce("*",as.data.frame(xmat),accumulate=FALSE))

xmat=matrix(runif(N),ncol=100)
system.time(cumprod.matrix(xmat))
system.time(t(apply(xmat,1,cumprod)))
system.time(do.call("cbind",Reduce("*",as.data.frame(xmat),accumulate=TRUE)))
system.time(prod.matrix(xmat))
system.time(apply(xmat,1,prod))
system.time(Reduce("*",as.data.frame(xmat),accumulate=FALSE))

xmat=matrix(runif(N),ncol=1000)
system.time(cumprod.matrix(xmat))
system.time(t(apply(xmat,1,cumprod)))
system.time(do.call("cbind",Reduce("*",as.data.frame(xmat),accumulate=TRUE)))
system.time(prod.matrix(xmat))
system.time(apply(xmat,1,prod))
system.time(Reduce("*",as.data.frame(xmat),accumulate=FALSE))

xmat=matrix(runif(N),ncol=10000)
system.time(cumprod.matrix(xmat))
system.time(t(apply(xmat,1,cumprod)))
system.time(do.call("cbind",Reduce("*",as.data.frame(xmat),accumulate=TRUE)))
system.time(prod.matrix(xmat))
system.time(apply(xmat,1,prod))
system.time(Reduce("*",as.data.frame(xmat),accumulate=FALSE))

xmat=matrix(runif(N),ncol=100000)
system.time(cumprod.matrix(xmat))
system.time(t(apply(xmat,1,cumprod)))
system.time(do.call("cbind",Reduce("*",as.data.frame(xmat),accumulate=TRUE)))
system.time(prod.matrix(xmat))
system.time(apply(xmat,1,prod))
system.time(Reduce("*",as.data.frame(xmat),accumulate=FALSE))


Jeff Laake wrote:
> Thanks for the tips on inline, jit and Reduce.  The latter was exactly 
> what I wanted although the loop
> is still the fastest for the simple product (accumulate=TRUE for 
> reduce).  With regards to Moshe's comment,
> I was just surprised by the timing difference.  I tend to use apply 
> without giving it much thought. After profiling the code
> it became apparent that a loop was better in this case.  I was just 
> surprised that a loop was still as good
> when the columns were 10 times the rows.
>
> I'm very intrigued by the inline package but couldn't find any 
> documentation on the compiler I need with
> a Windows machine to make it work.  Any hints would be very much 
> appreciated especially in regards to
> FORTRAN which was my first language some 35 years ago.  I have MS 
> FORTRAN 90 although I've not touched
> it for over 6 years thanks to the developers of R.
> Thanks much for the help.  --jeff
>
>
> Elapsed times from system.time.  see code below
>
>     Columns     10     100     1000     10000     100000
>
>     Rows     1000000     100000     10000     1000     100
> cumprod     Loop     1.0     1.0     1.3     1.2     3.0
>
>     Apply     27.3     3.4     1.8     1.2     1.4
>
>     Reduce     0.5     0.7     0.7     0.9     3.2
> prod     Loop     0.3     0.3     0.4     0.4     0.9
>
>     Apply     30.0     2.7     0.7     0.6     0.8
>
>     Reduce     0.6     0.6     0.8     1.0     4.7
>
>
> N=10000000
> xmat=matrix(runif(N),ncol=10)
> system.time(cumprod.matrix(xmat))
> system.time(t(apply(xmat,1,cumprod)))
> system.time(Reduce("*",as.data.frame(xmat),accumulate=FALSE))
> system.time(prod.matrix(xmat))
> system.time(apply(xmat,1,prod))
> system.time(Reduce("*",as.data.frame(xmat),accumulate=TRUE))
>
> xmat=matrix(runif(N),ncol=100)
> system.time(cumprod.matrix(xmat))
> system.time(t(apply(xmat,1,cumprod)))
> system.time(Reduce("*",as.data.frame(xmat),accumulate=FALSE))
> system.time(prod.matrix(xmat))
> system.time(apply(xmat,1,prod))
> system.time(Reduce("*",as.data.frame(xmat),accumulate=TRUE))
>
> xmat=matrix(runif(N),ncol=1000)
> system.time(cumprod.matrix(xmat))
> system.time(t(apply(xmat,1,cumprod)))
> system.time(Reduce("*",as.data.frame(xmat),accumulate=FALSE))
> system.time(prod.matrix(xmat))
> system.time(apply(xmat,1,prod))
> system.time(Reduce("*",as.data.frame(xmat),accumulate=TRUE))
>
> xmat=matrix(runif(N),ncol=10000)
> system.time(cumprod.matrix(xmat))
> system.time(t(apply(xmat,1,cumprod)))
> system.time(Reduce("*",as.data.frame(xmat),accumulate=FALSE))
> system.time(prod.matrix(xmat))
> system.time(apply(xmat,1,prod))
> system.time(Reduce("*",as.data.frame(xmat),accumulate=TRUE))
>
> xmat=matrix(runif(N),ncol=100000)
> system.time(cumprod.matrix(xmat))
> system.time(t(apply(xmat,1,cumprod)))
> system.time(Reduce("*",as.data.frame(xmat),accumulate=FALSE))
> system.time(prod.matrix(xmat))
> system.time(apply(xmat,1,prod))
> system.time(Reduce("*",as.data.frame(xmat),accumulate=TRUE))
>
> Charles C. Berry wrote:
>> On Sun, 17 Aug 2008, Jeff Laake wrote:
>>
>>> I spent a lot of time searching and came up empty handed on the 
>>> following query. Is there an equivalent to rowSums that does product 
>>> or cumulative product and avoids use of apply or looping? I found a 
>>> rowProd in a package but it was a convenience function for apply. As 
>>> part of a likelihood calculation called from optim, I’m computing 
>>> products and cumulative products of rows of matrices with far more 
>>> rows than columns. I started with apply and after some thought 
>>> realized that a loop of columns might be faster and it was 
>>> substantially faster (see below). Because the likelihood function is 
>>> called many times I’d like to speed it up even more if possible.
>>
>>
>> You might check out the 'inline' or 'jit' packages.
>>
>> Otherwise, if you can as easily treat xmat as a list (or data.frame),
>>
>>     Reduce( "*", xmat.data.frame, accumulate=want.cumprod )
>>
>> (where want.cumprod is FALSE for product, TRUE for cumulative 
>> product) will be a bit faster in many circumstances. However, this 
>> advantage is lost if you must retain xmat as a matrix since 
>> converting it to a data.frame seems to require more time than you save.
>>
>> HTH,
>>
>> Chuck
>>
>>>
>>> Below is an example showing the cumprod.matrix and prod.matrix 
>>> looping functions that I wrote and some timing comparisons to the 
>>> use of apply for different column and row dimensions. At this point 
>>> I’m better off with looping but I’d like to hear of any further 
>>> suggestions.
>>>
>>> Thanks –jeff
>>>
>>>>  prod.matrix=function(x)
>>> + {
>>> + y=x[,1]
>>> + for(i in 2:dim(x)[2])
>>> + y=y*x[,i]
>>> + return(y)
>>> + }
>>>
>>>>  cumprod.matrix=function(x)
>>> + {
>>> + y=matrix(1,nrow=dim(x)[1],ncol=dim(x)[2])
>>> + y[,1]=x[,1]
>>> + for (i in 2:dim(x)[2])
>>> + y[,i]=y[,i-1]*x[,i]
>>> + return(y)
>>> + }
>>>
>>>>  N=10000000
>>>>  xmat=matrix(runif(N),ncol=10)
>>>>  system.time(cumprod.matrix(xmat))
>>> user system elapsed
>>> 1.07 0.09 1.15
>>>>  system.time(t(apply(xmat,1,cumprod)))
>>> user system elapsed
>>> 29.27 0.21 29.50
>>>>  system.time(prod.matrix(xmat))
>>> user system elapsed
>>> 0.29 0.00 0.30
>>>>  system.time(apply(xmat,1,prod))
>>> user system elapsed
>>> 30.69 0.00 30.72
>>>>  xmat=matrix(runif(N),ncol=100)
>>>>  system.time(cumprod.matrix(xmat))
>>> user system elapsed
>>> 1.05 0.13 1.18
>>>>  system.time(t(apply(xmat,1,cumprod)))
>>> user system elapsed
>>> 3.55 0.14 3.70
>>>>  system.time(prod.matrix(xmat))
>>> user system elapsed
>>> 0.38 0.01 0.39
>>>>  system.time(apply(xmat,1,prod))
>>> user system elapsed
>>> 2.87 0.00 2.89
>>>>  xmat=matrix(runif(N),ncol=1000)
>>>>  system.time(cumprod.matrix(xmat))
>>> user system elapsed
>>> 1.30 0.18 1.46
>>>>  system.time(t(apply(xmat,1,cumprod)))
>>> user system elapsed
>>> 1.77 0.27 2.05
>>>>  system.time(prod.matrix(xmat))
>>> user system elapsed
>>> 0.46 0.00 0.47
>>>>  system.time(apply(xmat,1,prod))
>>> user system elapsed
>>> 0.7 0.0 0.7
>>>>  xmat=matrix(runif(N),ncol=10000)
>>>>  system.time(cumprod.matrix(xmat))
>>> user system elapsed
>>> 1.28 0.00 1.29
>>>>  system.time(t(apply(xmat,1,cumprod)))
>>> user system elapsed
>>> 1.19 0.08 1.26
>>>>  system.time(prod.matrix(xmat))
>>> user system elapsed
>>> 0.40 0.00 0.41
>>>>  system.time(apply(xmat,1,prod))
>>> user system elapsed
>>> 0.57 0.00 0.56
>>>>  xmat=matrix(runif(N),ncol=100000)
>>>>  system.time(cumprod.matrix(xmat))
>>> user system elapsed
>>> 3.18 0.00 3.19
>>>>  system.time(t(apply(xmat,1,cumprod)))
>>> user system elapsed
>>> 1.42 0.21 1.64
>>>>  system.time(prod.matrix(xmat))
>>> user system elapsed
>>> 1.05 0.00 1.05
>>>>  system.time(apply(xmat,1,prod))
>>> user system elapsed
>>> 0.82 0.00 0.81
>>>>  R.Version()
>>> $platform
>>> [1] "i386-pc-mingw32"
>>> .
>>> .
>>> .
>>> $version.string
>>> [1] "R version 2.7.1 (2008-06-23)"
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide 
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> Charles C. Berry                            (858) 534-2098
>>                                             Dept of Family/Preventive 
>> Medicine
>> E mailto:cberry at tajo.ucsd.edu                UC San Diego
>> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 
>> 92093-0901
>>
>> ------------------------------------------------------------------------
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list