[R] calculate row median of every three columns for a dataframe
David McPearson
dmcp @end|ng |rom webm@||@co@z@
Fri Apr 17 07:49:54 CEST 2020
Anna wrote:
> Hi all,
> I need to calculate a row median for every three columns of a
> dataframe. I made it work using the following script, but not happy
> with the script. Is there a simpler way for doing this?
To which Jim L responded:
> Hi Anna,
> I can't think of a simple way, but this function may make you happier:
> step_median<-function(x,window) {
>  x<-unlist(x)
>  stop<-length(x)-window+1
>  xout<-NA
>  nindx<-1
>  for(i in seq(1,stop,by=window)) {
>   xout[nindx]<-do.call("median",list(x[i:(i+window-1)]))
>   nindx<-nindx+1
>  }
>  return(xout)
> }
> apply(df,1,step_median,3)
> This should return a matrix where the columns are the medians
> calculated from blocks of "window" width on each row of "df". As Bert
> noted, you may want to think about a "rolling" median where the
> "windows" overlap. This can be done like so:
> library(zoo)
> apply(df,1,rollmedian,3)
> Jim
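One thing to watch with the apply() call in Jim's message: when the function returns a vector, apply() over rows puts each row's result in a column, so the output comes back transposed relative to the data frame. Here is a minimal check, assuming a small made-up data frame (dfr_demo and the renamed local last_start are just for illustration; the function is otherwise Jim's, lightly tidied):

```r
# Jim's function, lightly tidied: the local variable "stop" is renamed
# to last_start so it does not shadow base::stop().
step_median <- function(x, window) {
  x <- unlist(x)
  last_start <- length(x) - window + 1
  xout <- NA
  nindx <- 1
  for (i in seq(1, last_start, by = window)) {
    xout[nindx] <- median(x[i:(i + window - 1)])
    nindx <- nindx + 1
  }
  xout
}

# Made-up example data: 3 rows, 6 columns (two blocks of three).
dfr_demo <- data.frame(a = c(2, 3, 4), b = c(3, 5, 1), c = c(1, 3, 6),
                       d = c(7, 2, 1), e = c(2, 5, 3), f = c(4, 5, 1))

res <- apply(dfr_demo, 1, step_median, 3)
dim(res)          # 2 x 3: one row per block, one column per row of dfr_demo
by_row <- t(res)  # transpose to get one row per original row
dim(by_row)       # 3 x 2
```

If you want the result laid out like the original data frame (rows stay rows), wrapping the call in t() is probably what you are after.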
Another approach you might try is multiple calls to sapply/lapply. This
won't rid you of loops, but it will hide them:
# Example data. Some names changed to avoid collisions with
# R functions (the collisions are in the gap between the headphones,
# not in R).
dfr <- data.frame(a = c(2,3,4), b = c(3,5,1), c = c(1,3,6),
                  d = c(7,2,1), e = c(2,5,3), f = c(4,5,1))
# Turn each of the three-column groups into its own element
# in a list. Note: the subsetting fails with an "undefined
# columns selected" error if ncol(dfr) is not a multiple of 3
dlist <- lapply(seq(1, ncol(dfr), by = 3), function(enn)
    dfr[ , enn + 0:2])
# Then you can use sapply to calculate the row medians for each
# of the elements.
# Both of the following seem to work. I'm not sure which is
# more readable…
sapply(dlist, function(xx) apply(xx, 1, median))
sapply(dlist, apply, 1, median)
# I'm sure the cognoscenti will have a much more elegant way
# of doing this.
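
The multiple-of-3 caveat in the comments above can be sidestepped by building a grouping index for the columns and splitting on that. This is only a sketch, not a claim that it is the best way: it assumes base R's split.default() (which splits a data frame by column when given one group label per column), and the names dfr7, grp, blocks and row_meds, as well as the extra column g, are made up for the example.

```r
# Same example data as above, plus a hypothetical 7th column so that
# the column count is no longer a multiple of 3.
dfr <- data.frame(a = c(2,3,4), b = c(3,5,1), c = c(1,3,6),
                  d = c(7,2,1), e = c(2,5,3), f = c(4,5,1))
dfr7 <- cbind(dfr, g = c(9,8,7))

# One group label per column: 1 1 1 2 2 2 3
grp <- (seq_len(ncol(dfr7)) - 1) %/% 3 + 1

# split.default() treats the data frame as a list of columns, so each
# element of blocks is a data frame holding one group of columns.
# The last group simply ends up narrower than the others.
blocks <- split.default(dfr7, grp)

# Row medians per group, as before: one row per row of dfr7,
# one column per group.
row_meds <- sapply(blocks, function(xx) apply(xx, 1, median))
row_meds
```

Whether a short final group should be kept (as here, where its "median" is just the single value) or dropped is a choice that depends on your data.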
Cheers y'all,