[R] aggregate vs tapply; is there a middle ground?
hadley wickham
h.wickham at gmail.com
Sat Feb 11 23:44:53 CET 2006
> I faced a similar problem. Here's what I did
>
> tmp <-
> data.frame(A=sample(LETTERS[1:5],10,replace=T),B=sample(letters[1:5],10,replace=T),C=rnorm(10))
> tmp1 <- with(tmp,aggregate(C,list(A=A,B=B),sum))
> tmp2 <- expand.grid(A=sort(unique(tmp$A)),B=sort(unique(tmp$B)))
> merge(tmp2,tmp1,all.x=T)
>
> At least fewer than 10 extra lines of code. Anyone with a simpler solution?
Well, you can almost do this with the reshape package:
library(reshape)
tmp <- data.frame(A = sample(LETTERS[1:5], 10, replace = TRUE),
                  B = sample(letters[1:5], 10, replace = TRUE),
                  C = rnorm(10))
a <- recast(tmp, A + B ~ ., sum)
# see also recast(tmp, A ~ B, sum)
add.all.combinations(a, row="A", cols = "B")
Here add.all.combinations does essentially what you outlined above
(expand.grid followed by merge with all.x=TRUE), so it would be easy
enough to generalise to more than two dimensions.
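
For what it's worth, here is a rough base-R sketch of how that generalisation might look. It is not the reshape implementation: fill_combinations is a made-up helper name, the third grouping variable D is invented for the example, and it assumes the aggregated data frame has one column per grouping variable.

```r
# Hypothetical helper (not part of reshape): pad an aggregated data frame
# so that every combination of the grouping variables has a row.
fill_combinations <- function(data, vars) {
  # Distinct, sorted values of each grouping variable
  levels_list <- lapply(data[vars], function(x) sort(unique(x)))
  # Full grid of combinations, one column per grouping variable
  grid <- expand.grid(levels_list, stringsAsFactors = FALSE)
  # Left join: combinations absent from `data` get NA in the other columns
  merge(grid, data, by = vars, all.x = TRUE)
}

# Example with three grouping variables (D is invented for illustration)
tmp <- data.frame(
  A = sample(LETTERS[1:3], 10, replace = TRUE),
  B = sample(letters[1:3], 10, replace = TRUE),
  D = sample(c("x", "y"), 10, replace = TRUE),
  C = rnorm(10),
  stringsAsFactors = FALSE
)
agg  <- aggregate(C ~ A + B + D, data = tmp, FUN = sum)
full <- fill_combinations(agg, c("A", "B", "D"))
```

The result has one row per combination of the observed values of A, B and D, with NA in C wherever that combination never occurred in tmp.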
Hadley