[R] Re: formatting do.call output

Petr PIKAL petr.pikal at precheza.cz
Fri Feb 12 13:42:22 CET 2010


Hi

r-help-bounces at r-project.org wrote on 11.02.2010 20:36:31:

> 
> Hello, 
> 
> With a dataset that is about 1.4e6 rows long
> my.dat[seq(1:20),]
>          date Cell_ID Classification WT_Depth Frac_ET_Satsfd
>  1999-04-08     974              3 3.585083    0.244561400
>  1999-04-08    1188              3 3.526001    0.123484700
>  1999-04-08    1189              3 3.187012    0.215916700
>  1999-04-08    1403              3 3.395020    0.163972900
>  1999-04-08    1409              3 2.368042    0.415165500
>  1999-04-08    1617              3 4.039917    0.003224544
>  1999-04-08    1618              3 3.457886    0.148585900
>  1999-04-08    1619              3 2.148926    0.475924300
>  1999-04-08    1620              2 1.523926    0.633190900
>  1999-04-08    1621              3 2.197998    0.469294300
>  1999-04-15    1622              7 2.759033    0.325698400
>  1999-04-15    1623              3 2.802002    0.313719600
>  1999-04-15    1624              3 3.062988    0.243275900
>  1999-04-15    1833              2 3.483032    0.141840700
>  1999-04-15    1834              2 3.128052    0.232235400
>  1999-04-15    1835              7 3.354004    0.176209000
>  1999-04-15    1836              3 2.691040    0.341234700
>  1999-04-15    1837              3 3.140991    0.228083800
>  1999-04-15    1838              3 2.392944    0.413379300
>  1999-04-15    2048              2 3.712036    0.084534560
> .
> .
> .
> I use
> edm.func<-function(x){c(mu=mean(x),var=var(x))}
> 
> output<-do.call("rbind",tapply(my.dat$Frac_ET_Satsfd,list(my.dat$date,my.dat$Classification),edm.func))
> data.frame(output)
>             mu        var
> 1      0.7980007 0.03446669
> 2      0.7947966 0.03429280
> 3      0.8240736 0.02482441
> .
> .
> 3573 0.4509044 0.03821251
> 3574 0.4484108 0.03856110
> 3575 0.4519150 0.03889944
> 
> There are 447 dates and 8 classifications (1-8).  What is the best way to
> include the corresponding date and classification that goes with each row?

I would use a different approach to get the summary values. One option is 
to cbind the results of two aggregate calls:

aggregate(my.dat$Frac_ET_Satsfd,list(my.dat$date,my.dat$Classification), 
mean)
aggregate(my.dat$Frac_ET_Satsfd,list(my.dat$date,my.dat$Classification), 
sd)
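
As a rough illustration of how the two aggregate results could be combined into one table that keeps the date and classification, here is a minimal sketch. The data frame toy is made up for the example and only stands in for my.dat; var is used instead of sd so that the columns match edm.func from the question, and merge is one possible way to join the two results, not the only one.

```r
# Toy data standing in for my.dat (values are invented for illustration).
# Note that 1999-04-15 has no rows with Classification 2.
toy <- data.frame(
  date = c(rep("1999-04-08", 4), rep("1999-04-15", 2)),
  Classification = c(2, 2, 3, 3, 3, 3),
  Frac_ET_Satsfd = c(0.1, 0.3, 0.2, 0.4, 0.6, 0.8)
)

# Naming the list elements gives the grouping columns readable names
grp <- list(date = toy$date, Classification = toy$Classification)

# One aggregate() call per statistic; the value column is called "x" by default
mu <- aggregate(toy$Frac_ET_Satsfd, grp, mean)
v  <- aggregate(toy$Frac_ET_Satsfd, grp, var)
names(mu)[3] <- "mu"
names(v)[3]  <- "var"

# merge() matches rows on the shared grouping columns
res <- merge(mu, v, by = c("date", "Classification"))
print(res)
```

Only the date/classification combinations that actually occur in the data appear in the result, so every row carries its own labels.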

Another is to use by, which accepts a function returning several values.
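
A sketch of the by route, under the assumption that the labels are rebuilt from the dimnames of the result. The data frame toy is invented and stands in for my.dat. by returns one cell per date/classification combination, and combinations with no data come back as NULL, which rbind silently drops. That may be why the output in the question has 3575 rows rather than 447 * 8 = 3576, although that is only a guess.

```r
edm.func <- function(x) c(mu = mean(x), var = var(x))

# Toy data standing in for my.dat (values are invented for illustration).
# Note that 1999-04-15 has no rows with Classification 2.
toy <- data.frame(
  date = c(rep("1999-04-08", 4), rep("1999-04-15", 2)),
  Classification = c(2, 2, 3, 3, 3, 3),
  Frac_ET_Satsfd = c(0.1, 0.3, 0.2, 0.4, 0.6, 0.8)
)

b <- by(toy, list(date = toy$date, Classification = toy$Classification),
        function(d) edm.func(d$Frac_ET_Satsfd))

# All combinations, in the same (first index fastest) order as the cells of b
labels <- expand.grid(dimnames(b), stringsAsFactors = FALSE)

# Pull the cells out as a plain list; empty combinations are NULL
cells <- lapply(seq_along(b), function(i) b[[i]])
keep  <- !vapply(cells, is.null, logical(1))

# Bind the non-empty cells and attach the matching labels
res <- cbind(labels[keep, ], do.call(rbind, cells[keep]))
print(res)
```

The same labelling step should also work on the tapply result from the question, since by on a data frame is built on tapply.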

Another possibility is the sapply/split approach, which I believe is what 
the plyr package uses:

output <- t(sapply(split(my.dat$Frac_ET_Satsfd,
                         interaction(my.dat$date, my.dat$Classification)),
                   edm.func))

Untested, not sure about parentheses.
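
With the split/sapply approach the group labels end up in the row names, so they can be split back into columns. The following is a sketch on invented data (toy stands in for my.dat). The "|" separator and drop = TRUE are choices made for this example: the separator avoids clashing with characters in the dates, and drop = TRUE leaves out combinations that have no rows.

```r
edm.func <- function(x) c(mu = mean(x), var = var(x))

# Toy data standing in for my.dat (values are invented for illustration).
# Note that 1999-04-15 has no rows with Classification 2.
toy <- data.frame(
  date = c(rep("1999-04-08", 4), rep("1999-04-15", 2)),
  Classification = c(2, 2, 3, 3, 3, 3),
  Frac_ET_Satsfd = c(0.1, 0.3, 0.2, 0.4, 0.6, 0.8)
)

# One factor level per occurring date/classification pair, e.g. "1999-04-08|2"
g <- interaction(toy$date, toy$Classification, sep = "|", drop = TRUE)

# One row per group, columns mu and var; row names carry the group labels
stats <- t(sapply(split(toy$Frac_ET_Satsfd, g), edm.func))

# Split the row names back into date and classification
key <- do.call(rbind, strsplit(rownames(stats), "|", fixed = TRUE))
res <- data.frame(date = as.Date(key[, 1]),
                  Classification = as.integer(key[, 2]),
                  stats)
rownames(res) <- NULL
print(res)
```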

Regards
Petr




> 
> Thanks Eric