[R] Speeding up "accumulation" code in large matrix calc?

Fri Feb 24 22:57:25 CET 2012

On Fri, Feb 24, 2012 at 09:06:28PM +0100, Berend Hasselman wrote:
[...]
> Summary of results
> 
> > Nrow <- 100
> > Ncol <- 1000
> > A <- matrix((runif(Ncol*Nrow)<0.2)+0, nrow=Nrow)
> > print(rbind(z1,z2,z3,z4,z5,z6))
>    user.self sys.self elapsed user.child sys.child
> z1     0.543    0.005   0.551          0         0
> z2     0.299    0.004   0.304          0         0
> z3     0.070    0.012   0.082          0         0
> z4     0.112    0.002   0.114          0         0
> z5     0.263    0.007   0.271          0         0
> z6     0.062    0.009   0.070          0         0
[...]
> The original function (t1) benefits a lot from compiling to byte code.
> The function that operates on columns (t3) seems to be always the quickest in this experiment.

Hi.

Let me include one more solution operating on rows to the test.

  t4 <- function(A, outcome) {
      oneRow <- function(x, outcome)
      {
          n <- length(x)
          y <- 1:n
          z <- y
          z[c(FALSE, x[-n] != outcome)] <- 0
          y - cummax(z)
      }
      t(apply(A, 1, oneRow, outcome=outcome))
  }

  library(compiler)
  t1.c <- cmpfun(t1)
  t3.c <- cmpfun(t3)
  t4.c <- cmpfun(t4)

  Nrow <- 100
  Ncol <- 1000
  A <- matrix((runif(Ncol*Nrow)<0.2)+0, nrow=Nrow)

  library(rbenchmark)
  benchmark(t1(A,outcome=1),
            t3(A,outcome=1),
            t4(A,outcome=1),
            t1.c(A,outcome=1),
            t3.c(A,outcome=1),
            t4.c(A,outcome=1),
            columns=c("test", "user.self", "sys.self", "relative"),
            replications=1)

                    test user.self sys.self relative
  1   t1(A, outcome = 1)     0.744        0  46.5000
  4 t1.c(A, outcome = 1)     0.284        0  17.6875
  2   t3(A, outcome = 1)     0.076        0   4.7500
  5 t3.c(A, outcome = 1)     0.072        0   4.6250
  3   t4(A, outcome = 1)     0.016        0   1.0000
  6 t4.c(A, outcome = 1)     0.020        0   1.1250

Here, t4(), t4.c() is faster than t3(), t3.c().

With

  Nrow <- 20000
  Ncol <- 50
  A <- matrix((runif(Ncol*Nrow)<0.2)+0, nrow=Nrow)

i get

                    test user.self sys.self  relative
  1   t1(A, outcome = 1)     7.444        0 20.233696
  4 t1.c(A, outcome = 1)     2.740        0  7.442935
  2   t3(A, outcome = 1)     0.368        0  1.000000
  5 t3.c(A, outcome = 1)     0.368        0  1.002717
  3   t4(A, outcome = 1)     0.592        0  1.605978
  6 t4.c(A, outcome = 1)     0.536        0  1.456522

Here, t3() and t3.c() are faster than t4(), t4.c().

Petr.