[R] A function for plotting a boxplot with added dot and bars (for mean and SE) - please help improve my code

ONKELINX, Thierry
Wed Aug 12 12:29:37 CEST 2009

This can be done much more easily and transparently with ggplot2:

library(ggplot2)
# "fun" was called "fun.y" in ggplot2 versions before 3.3.0
ggplot(mtcars, aes(x = factor(round(wt)), y = mpg, colour = factor(am))) +
  geom_boxplot() +
  geom_point(stat = "summary", fun = "mean",
             position = position_dodge(width = 0.75))
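
The call above adds only the mean. As a rough sketch of how the SE bars from the original question could be added in the same style, one option is stat_summary() with ggplot2's mean_se() helper, which returns the mean together with mean minus and plus one standard error. The choice of the "pointrange" geom is an assumption made for illustration, not part of the reply.

```r
library(ggplot2)

# Boxplots per weight class and transmission type, with one extra layer:
# a dot at the group mean and a vertical bar spanning mean +/- 1 SE.
p <- ggplot(mtcars, aes(x = factor(round(wt)), y = mpg, colour = factor(am))) +
  geom_boxplot() +
  stat_summary(fun.data = mean_se, geom = "pointrange",
               position = position_dodge(width = 0.75))
print(p)
```

Groups with a single observation have no defined SE, so ggplot2 may drop them from this layer with a warning.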



Hello people,

A while back I wanted to plot boxplots for interactions of factors, with
a dot for the sample mean and bars for the SE added to each box.
After searching for some code, I found something that did this for a
single grouping factor, but nothing that allows for interactions the
way the original boxplot does.

After playing with the original code, I found a way to make it handle
interactions. The price was a slightly different syntax than that of
the original boxplot: the function takes its grouping variables as a
list, and so far I have not found a way to do the same task from a
formula input.
So, for example, the original boxplot is written like this:
boxplot(y ~ A*B)
whereas my function is called like this:
boxplot.2(y , list(A,B) )

In this e-mail I am giving away the code:
1) to help out others searching for this solution, and
2) in the hope that more experienced R programmers will come by and
improve on this code (by, for example, removing the for loop in it, or
allowing a formula instead of a list input).
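
One possible route to the formula input mentioned in point 2 is model.frame(), which evaluates the variables of a formula and returns them as columns of a data frame. The sketch below is only an illustration: the helper name formula.to.groups is hypothetical, and it assumes a two-sided formula whose response ends up in the first column of the model frame.

```r
# Hypothetical helper (not part of the original post): turn a formula such as
# y ~ A * B into the response vector and the list of grouping variables.
formula.to.groups <- function(formula, data = NULL) {
  mf <- model.frame(formula, data = data)
  list(response = mf[[1]], groups = as.list(mf[-1]))
}

parts <- formula.to.groups(mpg ~ round(wt) * am, data = mtcars)

# The same kind of split that boxplot.2 performs on its two arguments
tmp <- split(parts$response, parts$groups)
str(parts$groups)
```

The two pieces could then be passed on as boxplot.2(parts$response, parts$groups).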

Here is the code:

boxplot.2 <- function(fo.head, list.fo.tail = list(1), print.mean = TRUE,
                      plot.CI = TRUE, add.mean.sd.to.boxplot.names = TRUE,
                      plot.round.factor = 2, ...) {
  # split the response by the interaction of the grouping variables
  tmp   <- split(fo.head, list.fo.tail)
  means <- sapply(tmp, mean)
  stdev <- sqrt(sapply(tmp, var))  # standard deviation of each group
  n     <- sapply(tmp, length)
  # half-width of the 95% confidence interval (labelled "SE" in the output)
  ciw   <- qt(0.975, n - 1) * stdev / sqrt(n)
  old.names <- attributes(tmp)$names
  length.of.names <- length(old.names)
  new.names <- old.names
  # append "(mean,ciw)" to the name of each group
  for (i in seq_len(length.of.names)) {
    new.names[i] <- paste(old.names[i], " (",
                          round(means[i], plot.round.factor), ",",
                          round(ciw[i], plot.round.factor), ")", sep = "")
  }

  if (add.mean.sd.to.boxplot.names) {
    if (length(list.fo.tail) == 1) {
      sub.text <- paste("mean:", round(means, 3),
                        "(SE:", round(ciw, 3), " ; N:", n, ")")
      boxplot(tmp, xlab = sub.text,
              cex.axis = min(1, max(nchar(new.names))), ...)
    } else {
      boxplot(tmp, names = new.names,
              cex.axis = min(1, max(nchar(new.names))), ...)
    }
  } else {
    boxplot(tmp, ...)  # else - make boxplot without them
  }

  # adding the points and the bars
  points(means, col = "red", pch = 19)
  if (plot.CI && require(gplots)) {
    plotCI(x = means, uiw = ciw, col = "white", barcol = "blue",
           xaxt = "n", add = TRUE)
  }

  if (print.mean) {
    small.mean.and.se.table <- rbind(round(means, 3), round(ciw, 3), n)
    rownames(small.mean.and.se.table) <- c("means:", "SE's:", "N")
    print(small.mean.and.se.table)
  }
} # <-- end boxplot.2

# Here is a small example:
# (boxplot.2 has no data argument, so the variables are looked up with with())
with(mtcars, boxplot.2(mpg, list(round(wt), am), las = 2))
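
The function labels its bars "SE", but the quantity it computes (ciw) is qt(0.975, n-1) * sd / sqrt(n), which is the half-width of a 95% confidence interval. As a small sketch of the difference, the per-group numbers for the same mtcars example can be computed without a loop, since sapply() and paste0() are vectorised. The names se, summary.table and the use of drop = TRUE are choices made here for illustration.

```r
# Per-group summaries for the mtcars example, computed without a for loop.
groups <- with(mtcars, split(mpg, list(round(wt), am), drop = TRUE))
n     <- sapply(groups, length)
means <- sapply(groups, mean)
se    <- sapply(groups, sd) / sqrt(n)   # standard error of the mean
# 95% CI half-width, the quantity boxplot.2 calls ciw;
# groups with one observation give NA/NaN (with a warning from qt)
ciw   <- qt(0.975, df = n - 1) * se

# paste0() works on whole vectors, so the labels need no loop either
new.names <- paste0(names(groups), " (", round(means, 2), ",",
                    round(ciw, 2), ")")

summary.table <- rbind(mean = means, SE = se, CI95 = ciw, N = n)
print(round(summary.table, 3))
```

For every group with more than one observation the CI half-width is larger than the SE, because the t quantile is greater than 1.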

I hope this will benefit others, and that others will improve on
this code and give it back to the community.

Tal Galili


