[R] remove rows in data frame by average
David Winsemius
dwinsemius at comcast.net
Fri Feb 22 20:06:40 CET 2013
On Feb 21, 2013, at 2:15 PM, William Dunlap wrote:
> Many find the functions in the plyr package more convenient to use than the
> do.call(rbind, lapply(split(...), ...)) business:
>
>> library(plyr)
>> ddply(dat1, .(Subject,Block), summarize, MeanFeature1=mean(Feature1), MeanFeature2=mean(Feature2))
>   Subject Block MeanFeature1 MeanFeature2
> 1       1     1         55.0         29.0
> 2       1     2         42.5         38.5
> 3       1     3         64.0         14.0
>
> Change the calls to 'mean' to calls to other summary functions like 'sum' or 'max' as you wish.
On the subject of something less complex than "the do.call(rbind, lapply(split(...), ...)) business":
`aggregate` in base R performs the same sort of operation when the same function is to be applied to every column being summarized:
> aggregate(dat1[, c('Feature1', 'Feature2')] , dat1[, c("Subject", "Block")], FUN=mean)
  Subject Block Feature1 Feature2
1       1     1     55.0     29.0
2       1     2     42.5     38.5
3       1     3     64.0     14.0
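
For what it's worth, `aggregate` also has a formula interface, which some may find easier to read when there are several grouping columns. The following is only a minimal sketch, assuming the two-subject data (`dat2`) from arun's message quoted below; the name `res` is just an illustrative choice, and the reordering step is optional.

```r
# Rebuild the two-subject example data (dat2) from the quoted message
dat2 <- read.table(text = "
Subject Block Trial Feature1 Feature2
1 1 1 48 40
1 1 2 62 18
1 2 1 34 43
1 2 2 51 34
1 3 1 64 14
2 1 1 48 35
2 1 2 54 15
2 2 1 49 50
2 2 2 64 40
2 3 1 38 28
", header = TRUE)

# Formula interface: columns to summarize on the left of `~`,
# grouping columns on the right. Trial is simply left out.
res <- aggregate(cbind(Feature1, Feature2) ~ Subject + Block,
                 data = dat2, FUN = mean)

# aggregate() returns rows with the first grouping variable varying fastest,
# so sort by Subject and then Block for a more readable result.
res <- res[order(res$Subject, res$Block), ]
rownames(res) <- NULL
res
```

The means should agree with the `res2` values shown in the quoted message. As with the data-frame form above, this only fits the case where one function is applied to all of the summarized columns.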
--
David
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
>> Of arun
>> Sent: Thursday, February 21, 2013 1:45 PM
>> To: Johannes Brand
>> Cc: R help
>> Subject: Re: [R] remove rows in data frame by average
>>
>> Hi,
>>
>> Maybe this helps:
>>
>> dat1<- read.table(text="
>> Subject Block Trial Feature1 Feature2
>> 1 1 1 48 40
>> 1 1 2 62 18
>> 1 2 1 34 43
>> 1 2 2 51 34
>> 1 3 1 64 14
>> ",sep="",header=TRUE)
>>
>>
>> res1<-do.call(rbind,lapply(split(dat1,dat1$Block),function(x)
>> data.frame(unique(x[,1:2]),t(colMeans(x[,-c(1:3)])))))
>> res1
>> #  Subject Block Feature1 Feature2
>> #1       1     1     55.0     29.0
>> #2       1     2     42.5     38.5
>> #3       1     3     64.0     14.0
>>
>>
>> #With multiple subjects:
>> dat2<- read.table(text="
>> Subject Block Trial Feature1 Feature2
>> 1 1 1 48 40
>> 1 1 2 62 18
>> 1 2 1 34 43
>> 1 2 2 51 34
>> 1 3 1 64 14
>> 2 1 1 48 35
>> 2 1 2 54 15
>> 2 2 1 49 50
>> 2 2 2 64 40
>> 2 3 1 38 28
>> ",sep="",header=TRUE)
>>
>> res2<-do.call(rbind,lapply(split(dat2,list(dat2$Subject,dat2$Block)),function(x)
>> data.frame(unique(x[,1:2]),t(colMeans(x[,-c(1:3)])))))
>> res2<-do.call(rbind,split(res2,res2$Subject))
>> res2
>> #  Subject Block Feature1 Feature2
>> #1       1     1     55.0     29.0
>> #2       1     2     42.5     38.5
>> #3       1     3     64.0     14.0
>> #4       2     1     51.0     25.0
>> #5       2     2     56.5     45.0
>> #6       2     3     38.0     28.0
>>
>>
>>
>> A.K.
>>
>>
>>
>> ----- Original Message -----
>> From: Johannes Brand <brandjohannes at gmx.de>
>> To: r-help at r-project.org
>> Cc:
>> Sent: Thursday, February 21, 2013 12:02 PM
>> Subject: [R] remove rows in data frame by average
>>
>> Dear all,
>>
>> I have a data frame, which looks like this:
>>
>> Subject | Block | Trial | Feature1 | Feature2 ....
>> 1 | 1 | 1 | ... | ...
>> 1 | 1 | 2 | ... | ...
>> 1 | 2 | 1 | ... | ...
>> 1 | 2 | 2 | ... | ...
>> 1 | 3 | 1 | ... | ...
>> ...| ...| ...| ... | ...
>>
>> Can I remove the "Trial" column by averaging all the rows and without using
>> a "for loop"?
>>
>> At the end my data frame should look like this:
>>
>> Subject | Block | Feature1 | Feature2 ....
>> 1 | 1 | ... | ...
>> 1 | 2 | ... | ...
>> 1 | 3 | ... | ...
>> ...| ...| ... | ...
>>
>> Thank you a lot for your help.
>>
>> Best,
>> Johannes
>>
David Winsemius
Alameda, CA, USA