[R] Better way to create tables of mean & standard deviations
Benjamin Dickgiesser
dickgiesser at gmail.com
Tue Nov 7 15:53:22 CET 2006
Thank you, that was exactly what I was looking for.
On 11/7/06, hadley wickham <h.wickham at gmail.com> wrote:
> > > I can only think of rather complex ways to solve the labeling issue...
> > >
> > > I would appreciate it if someone could point out if there are
> > > better/cleaner/easier ways of achieving what I'm trying to do.
> >
> > Does this help?
> >
> > g <- function(y) {
> >   # Per column: mean, SD and count of the non-missing values
> >   s <- apply(y, 2,
> >              function(z) {
> >                z <- z[!is.na(z)]
> >                n <- length(z)
> >                # Always return three named values to match stat.name
> >                if(n==0) c(Mean=NA, SD=NA, N=0) else
> >                if(n==1) c(Mean=z, SD=NA, N=1) else {
> >                  m <- mean(z)
> >                  s <- sd(z)
> >                  c(Mean=m, SD=s, N=n)
> >                }
> >              })
> >   w <- as.vector(s)
> >   names(w) <- as.vector(outer(rownames(s), colnames(s), paste, sep=''))
> >   w
> > }
> >
> > df <- data.frame(LAB = rep(1:8, each=60), BATCH = rep(c(1,2), 240),
> >                  Y = rnorm(480))
> >
> > library(Hmisc)
> >
> > with(df, summarize(cbind(Y),
> >                    llist(LAB, BATCH),
> >                    FUN = g,
> >                    stat.name=c("mean", "stdev", "n")))
> >
> >    LAB BATCH        mean     stdev  n
> > 1    1     1  0.13467569 1.0623188 30
> > 2    1     2  0.15204232 1.0464287 30
> > 3    2     1 -0.14470044 0.7881942 30
> > 4    2     2 -0.34641739 0.9997924 30
> > 5    3     1 -0.17915298 0.9720036 30
> > 6    3     2 -0.13942702 0.8166447 30
> > 7    4     1  0.08761900 0.9046908 30
> > 8    4     2  0.27103640 0.7692970 30
> > 9    5     1  0.08017377 1.1537611 30
> > 10   5     2  0.01475674 1.0598336 30
> > 11   6     1  0.29208572 0.8006171 30
> > 12   6     2  0.10239509 1.1632274 30
> > 13   7     1 -0.35550603 1.2016190 30
> > 14   7     2 -0.33692452 1.0458184 30
> > 15   8     1 -0.03779253 1.0385098 30
> > 16   8     2 -0.18652758 1.1768540 30
> >
> > with(df, summarize(cbind(Y),
> >                    llist(LAB),
> >                    FUN = g,
> >                    stat.name=c("mean", "stdev", "n")))
> >
> >   LAB        mean     stdev  n
> > 1   1  0.14335900 1.0454666 60
> > 2   2 -0.24555892 0.8983465 60
> > 3   3 -0.15929000 0.8902766 60
> > 4   4  0.17932770 0.8377011 60
> > 5   5  0.04746526 1.0988603 60
> > 6   6  0.19724041 0.9946316 60
> > 7   7 -0.34621527 1.1168682 60
> > 8   8 -0.11216005 1.1029466 60
> >
> > Once you write the summary function g, it's not that complex. See
> > ?summarize in the Hmisc package for more detail. Also, you might take a
> > look at the doBy and reshape packages.
>
> With the reshape package, I'd do it like this:
>
> df <- data.frame(LAB = rep(1:8, each=60), BATCH = rep(c(1,2), 240),
>                  Y = rnorm(480))
> dfm <- melt(df, measured="Y")
>
> cast(dfm, LAB ~ ., c(mean, sd, length))
> cast(dfm, LAB + BATCH ~ ., c(mean, sd, length))
> cast(dfm, LAB + BATCH ~ ., c(mean, sd, length), margins=TRUE)
>
> Regards,
>
> Hadley
>
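
For comparison, the same mean/SD/count table can probably be built with base R alone, using aggregate(). This is only a minimal sketch and not something proposed in the thread: the helper name `stats`, the result names `by_lab_batch` and `by_lab`, and the set.seed() call are assumptions added for illustration.

```r
# Assumption: the seed is added only so the sketch is reproducible
set.seed(1)
df <- data.frame(LAB = rep(1:8, each = 60),
                 BATCH = rep(c(1, 2), 240),
                 Y = rnorm(480))

# Hypothetical helper: mean, SD and count of the non-missing values
stats <- function(z) {
  z <- z[!is.na(z)]
  c(mean = mean(z), stdev = sd(z), n = length(z))
}

# One row per LAB/BATCH combination
by_lab_batch <- aggregate(Y ~ LAB + BATCH, data = df, FUN = stats)
# aggregate() stores the three statistics as one matrix column;
# do.call(data.frame, ...) flattens it into Y.mean, Y.stdev and Y.n
by_lab_batch <- do.call(data.frame, by_lab_batch)
by_lab_batch <- by_lab_batch[order(by_lab_batch$LAB, by_lab_batch$BATCH), ]

# One row per LAB
by_lab <- do.call(data.frame, aggregate(Y ~ LAB, data = df, FUN = stats))

print(by_lab_batch)
print(by_lab)
```

The column names come out as Y.mean, Y.stdev and Y.n rather than the plain names given by stat.name, so a names(...) <- assignment would be needed to match the Hmisc output exactly.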