[R] colSums in C

Peter Dalgaard BSA p.dalgaard at biostat.ku.dk
Tue Dec 18 01:32:29 CET 2001

David Brahm  <brahm at alum.mit.edu> writes:

> I asked how to write speedy C code to implement colSums().  My original version
> on a 400x40000 matrix took 5.72s.
> Peter Dalgaard <p.dalgaard at biostat.ku.dk> suggested some more efficient coding,
> which sped my example up to 3.90s.  Douglas Bates <bates at stat.wisc.edu>
> suggested using .Call() instead of .C, and I was amazed to see the time went
> down to 0.69s!  Doug had actually posted his code (a package called "MatUtils")
> to R-help on July 19, 2001.
> I've taken Doug's code, added names to the result, and included an na.rm flag.
> Unfortunately, my na.rm option makes it really slow again! (12.15s).  That's no
> faster than pre-processing the matrix with "m[is.na(m)] <- 0".  Can anyone help
> me understand why the ISNA conditional is taking so much time?  The C code is
> below.  Thanks!

>     if (narm) for (j = 0; j < p; j++) {
>       for (sum = 0., i = 0; i < n; i++) if (!ISNA(mm[i])) sum += mm[i];

ISNA maps to the *function* R_IsNA, and function calls are expensive
inside a tight inner loop. Also, you are probably breaking some
pipelining with the extra conditional. Just for testing, what happens
if you use isnan() instead?
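To make the isnan() experiment concrete, here is a minimal standalone sketch of the column-sum loop. It is not Doug's or David's code: the function name col_sums and its argument list are made up for illustration, and it uses plain C arrays instead of the .Call() interface. It assumes a C99 compiler, where isnan() is a macro that compilers can usually expand inline. Note that isnan() cannot tell NA from an ordinary NaN, since R's real NA is itself a NaN value, so both get skipped.

```c
#include <math.h>    /* isnan */
#include <stddef.h>  /* size_t */

/* Hypothetical helper: sum each column of an n-by-p matrix stored in
 * column-major order (as R stores it) into out[0..p-1].
 * If narm is nonzero, NaN entries (which include R's NA) are skipped. */
void col_sums(const double *m, size_t n, size_t p, int narm, double *out)
{
    for (size_t j = 0; j < p; j++) {
        const double *col = m + j * n;   /* start of column j */
        double sum = 0.0;
        if (narm) {
            /* isnan() avoids a call to a non-inlined function per element */
            for (size_t i = 0; i < n; i++)
                if (!isnan(col[i])) sum += col[i];
        } else {
            for (size_t i = 0; i < n; i++)
                sum += col[i];
        }
        out[j] = sum;
    }
}
```

Timing this against the ISNA version on the same matrix should show how much of the slowdown is call overhead and how much is the conditional itself.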

We could potentially set things up so that the compiler gets a chance
to inline R_IsNA and friends, so I wonder how much we might gain.
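As a rough sketch of what an inlinable NA test could look like (this is not R's actual source; the name is_na_inline is invented here): R's real NA is a NaN whose low 32-bit word holds 1954, so the test is an isnan() check followed by a look at that word. The sketch assumes IEEE 754 doubles and that doubles and 64-bit integers share the same byte order.

```c
#include <math.h>    /* isnan */
#include <stdint.h>  /* uint64_t */
#include <string.h>  /* memcpy */

/* Hypothetical inline version of an NA test.
 * Assumption: NA is a NaN whose low 32 bits equal 1954. */
static inline int is_na_inline(double x)
{
    uint64_t bits;
    if (!isnan(x))
        return 0;                        /* ordinary number: not NA */
    memcpy(&bits, &x, sizeof bits);      /* inspect the bit pattern safely */
    return (bits & 0xFFFFFFFFu) == 1954u;
}
```

Because the function is static inline and visible in the same translation unit, the compiler is free to fold it into the summing loop, which is the gain I am wondering about.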

   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907