[R] boxplot with average instead of median
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Tue Aug 5 15:31:43 CEST 2008
Another option is to modify a copy of panel.bpplot from the Hmisc package
(called mypanel here) and specify
library(lattice)
bwplot(..., panel=mypanel)
Note that panel.bpplot will show the mean. It shows more quantiles than
a standard box plot so you get more than a 3-number summary.
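
As a minimal sketch of the unmodified panel function in use: the simulated data and object names below are made up for illustration, and the call is only run if Hmisc is installed. panel.bpplot is written for horizontal plots, so the grouping factor goes on the left of the formula. means=TRUE is already the default and is spelled out only for clarity.

```r
library(lattice)

# Illustrative data, not from the original thread
set.seed(1)
x <- rnorm(100)
g <- gl(5, 20)

if (requireNamespace("Hmisc", quietly = TRUE)) {
  # Extra arguments to bwplot (here means=) are passed on to the panel function
  p <- bwplot(g ~ x, panel = Hmisc::panel.bpplot, means = TRUE)
  print(p)
}
```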
If you show the mean and standard deviation you are assuming much
(especially symmetry) about the distribution you are trying to show.
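
A small example of the problem, using invented right-skewed values (not data from this thread): mean - SD can fall below the smallest observation, which a quantile never does.

```r
# Hypothetical right-skewed sample, all values positive
x <- c(1, 1, 1, 2, 2, 3, 4, 6, 10, 30)

m <- mean(x)
s <- sd(x)

c(lower = m - s, mean = m, upper = m + s)  # limits of a mean +- SD "box"
quantile(x, c(0.25, 0.5, 0.75))            # limits of the usual quartile box
min(x)                                     # smallest observed value
```

Here the lower edge of the mean +- SD box is negative although every observation is positive, and the mean sits well above the median.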
Frank
S Ellison wrote:
> boxplot itself is hardwired to produce the boxplot.stats list, and that
> is not easy to change.
>
> To get a different set of stats, you would need to do things in two
> stages:
> i) create a boxplot object of the type returned by boxplot, but using
> your own stats
> ii) call bxp on that object.
>
> That's kind of tricky.
>
> One comparatively simple alternative is to use the lattice package's
> bwplot, and specify an alternate function for the stats parameter. You
> have to write the alternate function, though. Here's one that would
> probably do something like what you want; it is intended to deliver
> boxes set to mean +-sd, outliers marked outside mean+-2.5sd by default
> and whiskers set to the outermost of mean+-sd or outermost non-outlier
> data.
>
> Not that I'd recommend it, but it's entertaining writing it. With a bit
> more wrapping, it could be used to generate a bxp-like object as well,
> as per Uwe's suggestion.
>
> boxplot.norm <- function(x, do.conf=TRUE, coef=1.5, do.out=TRUE, p=0.05) {
>    xx <- x[!is.na(x)]
>    n <- length(xx)
>    s <- sd(xx)
>    m <- mean(xx)
>    stats <- c(min(xx), m-s, m, m+s, max(xx))
>
>    #for compatibility with boxplot.stats
>    if (coef == 0) do.out <- FALSE
>
>    if (do.out) {
>       #coef+1 gives outliers outside mean+-2.5s, because bwplot
>       #passes its default coef=1.5 to the stats function, and outlier
>       #marking at 2.5s is not a million miles from boxplot.stats's
>       #lower/upper quartiles -/+ 1.5*iqr if normality is assumed
>       out <- abs(xx-m)/s > (coef+1)
>    } else {
>       #logical vector, so that xx[out] and xx[!out] behave consistently
>       out <- rep(FALSE, n)
>    }
>
>    if (any(out))
>       stats[c(1, 5)] <- range(xx[!out])
>
>    #and tidy up any silly whiskers... mean+-sd can be outside the
>    #outer data points
>    stats[1] <- min(stats[1:2])
>    stats[5] <- max(stats[4:5])
>
>    #Note: this is simply the (1-p)% confidence interval, not the
>    #notch width required for a pairwise test at (1-p)% confidence.
>    #If notches don't overlap, though, you certainly have a significant
>    #difference at _at least_ the (1-p)% level.
>    #But bwplot can't use it anyway, 'cos it doesn't do notches.
>    conf <- if (do.conf && n>1)
>       stats[3] + c(-1,1) * s * qt(1-p/2, n-1)/sqrt(n)
>
>    list(stats = stats, n = n, conf = conf, out = xx[out])
> }
>
>
> ##Try it out...
> require(lattice)
> x<-rnorm(100)
> g<-gl(5,20)
> bwplot(x~g, main="The default")
>
> dev.new()  #opens a second graphics device; windows() is Windows-only
> bwplot(x~g, stats=boxplot.norm, main="Mean +- SD")
>
>>>> Chad Junkermeier <junkermeier at byu.edu> 05/08/2008 05:36 >>>
> I really like the ease of use with the boxplot command in R. I would
> rather have a boxplot that shows the average value and the standard
> deviation than the median value and the quartiles.
>
> Is there a way to do this?
>
>
> Chad Junkermeier, Graduate Student
> Dept. of Physics
> West Virginia University
> PO Box 6315
> 210 Hodges Hall
> Morgantown WV 26506-6315
> phone: (304) 293-3442 ext. 1430
> fax: (304) 293-5732
> email: chad.junkermeier{at}mail.wvu.edu
> -----------------------------------------------------
> Concurrently at:
> Dept. of Physics and Astronomy
> Brigham Young University
> Provo UT 84602
> email: junkermeier{at}byu.edu
>
> cell: (801) 380-8895
>
--
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University