[R] colSums in C

David Brahm brahm at alum.mit.edu
Tue Dec 18 00:57:48 CET 2001

I asked how to write speedy C code to implement colSums().  My original version
on a 400x40000 matrix took 5.72s.

Peter Dalgaard <p.dalgaard at biostat.ku.dk> suggested some more efficient coding,
which sped my example up to 3.90s.  Douglas Bates <bates at stat.wisc.edu>
suggested using .Call() instead of .C, and I was amazed to see the time went
down to 0.69s!  Doug had actually posted his code (a package called "MatUtils")
to R-help on July 19, 2001.

I've taken Doug's code, added names to the result, and included an na.rm flag.
Unfortunately, my na.rm option makes it really slow again! (12.15s).  That's no
faster than pre-processing the matrix with "m[is.na(m)] <- 0".  Can anyone help
me understand why the ISNA conditional is taking so much time?  The C code is
below.  Thanks!

---- begin "colSums.c" (90% due to Doug Bates) ----
#include <R.h>
#include <Rdefines.h>

SEXP colSums(SEXP m, SEXP NaRm) {
    int i, j, *mdims, n, p, narm;
    double *mm, sum;
    SEXP val, nms;

    if (!isMatrix(m)) error("m must be a matrix");
    mdims = INTEGER(GET_DIM(m));
    narm  = asLogical(NaRm);
    n = mdims[0]; p = mdims[1];
    PROTECT(val  = NEW_NUMERIC(p));
    PROTECT(  m  = AS_NUMERIC(m));
    mm = REAL(m);
    if (narm) for (j = 0; j < p; j++) {
      for (sum = 0., i = 0; i < n; i++) if (!ISNA(mm[i])) sum += mm[i];
      REAL(val)[j] = sum;
      mm += n;
    } else for (j = 0; j < p; j++) {
      for (sum = 0., i = 0; i < n; i++) sum += mm[i];
      REAL(val)[j] = sum;
      mm += n;
    namesgets(val, nms);
    return val;
---- end "colSums.c" ----

                              -- David Brahm (brahm at alum.mit.edu)
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch

More information about the R-help mailing list