[R] Efficient computation of trimmed stats?

Benilton Carvalho bcarvalh at jhsph.edu
Mon May 14 18:58:42 CEST 2007

Hi everyone,

I was wondering if there is anything already implemented for  
efficient ("row-wise") computation of group-specific trimmed stats  
(mean and sd on the trimmed vector) on large matrices.

For example:

nc = 300
nr = 250000
x = matrix(rnorm(nc*nr), ncol=nc)
g = matrix(sample(1:3, nr*nc, replace=TRUE), ncol=nc)

trimmedMeanByGroup <- function(y, grp, trim=.05)
   tapply(y, factor(grp, levels=1:3), mean, trim=trim)

sapply(1:10, function(i) trimmedMeanByGroup(x[i,], g[i,]))

works fine... but:

 > system.time(sapply(1:nr, function(i) trimmedMeanByGroup(x[i,], g[i,])))
    user  system elapsed
399.928   0.019 399.988

is far too slow for my purposes.
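
The example above covers only the trimmed mean. For the "sd on the trimmed vector" part, here is a minimal sketch of what I have in mind, assuming the same trimming rule that mean(x, trim=) applies (drop floor(n*trim) values from each end of the sorted vector). The names trimmedSd and trimmedStatsByGroup are made up for illustration and do not come from any package:

```r
# Hypothetical helper: sd of the values that remain after dropping a
# fraction `trim` from each end of the sorted vector (assumed to follow
# the same index rule as mean(x, trim=) for trim < 0.5).
trimmedSd <- function(y, trim = .05) {
  y <- sort(y)                     # sort() drops NAs by default
  n <- length(y)
  lo <- floor(n * trim) + 1        # first index kept
  hi <- n + 1 - lo                 # last index kept
  if (lo > hi) return(NA_real_)    # nothing left to summarise
  sd(y[lo:hi])                     # NA when a single value remains
}

# Hypothetical wrapper: both trimmed stats for one row, by group.
# Returns a 2 x 3 matrix (rows: mean, sd; columns: groups 1 to 3).
trimmedStatsByGroup <- function(y, grp, trim = .05) {
  f <- factor(grp, levels = 1:3)
  rbind(mean = tapply(y, f, mean, trim = trim),
        sd   = tapply(y, f, trimmedSd, trim = trim))
}
```

This only pins down the quantities I am after; it still goes through tapply once per row and statistic, so it does nothing for the speed problem.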

Maybe some package has some implementation of the above?

Thank you very much,

Benilton Carvalho
PhD Candidate
Department of Biostatistics
Bloomberg School of Public Health
Johns Hopkins University
bcarvalh at jhsph.edu
