[R] column means dropping column minimum
Evan Cooch
evan.cooch at gmail.com
Tue Dec 8 15:29:35 CET 2015
Suppose I have something like the following dataframe:
samp1 <- c(60,50,20,90)
samp2 <- c(60,60,90,58)
samp3 <- c(25,65,65,90)
test <- data.frame(samp1,samp2,samp3)
I want to calculate column means. Easy enough, if I want to use all the
data within each column:
print(colMeans(test, na.rm = TRUE))
However, I'm danged if I can figure out how to do the same thing after
dropping the minimum value for each column. For example, column 1 in the
dataframe test consists of 60, 50, 20, 90. I want to calculate the mean
over (60,50,90), dropping the minimum value (20). Figuring out what the
minimum value is in a single column is easy, but I can't figure out how
to arm-twist colMeans into 'applying itself' to the elements of a column
greater than the minimum, for each column in turn. I've tried
permutations of select, subset etc., to no avail. Only thing I can think
of is to (i) find the minimum in a column, (ii) change it to NA, and
then (iii) tell colMeans to use na.rm = TRUE:
test2 <- test
for (i in seq_len(ncol(test))) { test2[which.min(test[, i]), i] <- NA }
print(test2)
print(colMeans(test2, na.rm = TRUE))
While this works, seems awfully 'clunky' -- is there a better way?
Thanks in advance...
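For reference, one possible less clunky route is sketched here. It assumes that only a single occurrence of the minimum should be dropped from each column (ties are not all removed, which matches the which.min approach described above) and that the columns contain no NAs. The helper name mean_drop_min is made up for illustration; everything else is base R.

```r
samp1 <- c(60, 50, 20, 90)
samp2 <- c(60, 60, 90, 58)
samp3 <- c(25, 65, 65, 90)
test <- data.frame(samp1, samp2, samp3)

# Hypothetical helper: mean of a vector after removing one occurrence
# of its minimum (which.min returns the index of the first minimum).
mean_drop_min <- function(x) mean(x[-which.min(x)])

# A data frame is a list of columns, so sapply() applies the helper
# to each column in turn and returns a named numeric vector.
trimmed <- sapply(test, mean_drop_min)
print(trimmed)

# Same result by arithmetic, without modifying or subsetting the data:
# (column sum - column minimum) / (number of rows - 1)
trimmed2 <- (colSums(test) - sapply(test, min)) / (nrow(test) - 1)
print(trimmed2)
```

For column 1 this averages 60, 50 and 90, as wanted. The second form avoids the NA bookkeeping entirely, but it would need adjusting if the columns could themselves contain missing values.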