[R] summarizing a dataset on a factor

David Carlson dcarlson at tamu.edu
Thu Mar 27 22:06:00 CET 2014


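It may be possible to do this in a single step, but two steps are
straightforward: average response by id and age, pull response2 for the
left eye, then merge the two on id.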

> x1 <- aggregate(response~id+age, data, mean)
> x2 <- data[data$eye=="l", c("id", "response2")]
> merge(x1, x2)
  id age response response2
1  1   2     4.60      High
2  2   9     2.65      High
3  3   5     3.65      High
4  4   2     7.55      High
5  5  11     4.15      High
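
For what it's worth, here is a self-contained sketch of the same idea on fixed, made-up data, followed by one way it might be collapsed into a single pass with split(). The data values and the names dat, two_step, one_pass and summarize_id are my own assumptions for illustration; they are not from the original post.

```r
# Made-up data with the same layout as the question (values are illustrative)
dat <- data.frame(
  id        = rep(1:3, 2),
  eye       = rep(c("l", "r"), each = 3),
  age       = rep(c(2, 9, 5), 2),
  response  = c(4.0, 2.0, 3.0, 6.0, 4.0, 5.0),
  response2 = c("High", "Low", "High", "Low", "Low", "High"),
  stringsAsFactors = FALSE
)

# Two steps: mean response per id/age, then attach the left-eye response2
x1 <- aggregate(response ~ id + age, data = dat, FUN = mean)
x2 <- dat[dat$eye == "l", c("id", "response2")]
two_step <- merge(x1, x2)  # merges on the shared column "id"

# One pass: split by id and build one summary row per subject
# (summarize_id is a hypothetical helper name)
summarize_id <- function(d) {
  data.frame(
    id        = d$id[1],
    age       = d$age[1],
    response  = mean(d$response),
    response2 = d$response2[d$eye == "l"][1],
    stringsAsFactors = FALSE
  )
}
one_pass <- do.call(rbind, lapply(split(dat, dat$id), summarize_id))
rownames(one_pass) <- NULL
```

Note that the two-step version groups on id and age together, so it returns one row per id only when age is constant within each id, as it is in the example data.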

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: r-help-bounces at r-project.org
[mailto:r-help-bounces at r-project.org] On Behalf Of Tom Wright
Sent: Thursday, March 27, 2014 3:48 PM
To: r-help at r-project.org
Subject: [R] summarizing a dataset on a factor

Hi all,
I've spent too long in matlab land recently and seem to have forgotten
my R skillz ;-)
I'm sure I'm missing a simple way to do this...

Given a data frame
id<-rep(1:5,2)
eye<-c(rep('l',5),rep('r',5))
age<-rep(round(runif(5,0,12)),2)
response<-round(runif(10,1,10)*10)/10
response2<-sample(c('High','Low'),10,replace=TRUE)

data<-data.frame(id,eye,age,response,response2)

I want to create a new dataset averaging the response variable from both
eyes for each test.

I know there are many ways to do this but... I would also like to keep
the value of the response2 variable for the left eye.

ending up with the dataset
id	age	response	response2
1	9	3.65	High
2	10	3.85	High
3	8	8.15	Low
4	4	4.4	Low
5	0	4.6	High

I thought something like
f <- function(x) {
  # make my choices here
}
aggregate(data, list(data$id), f)

but x only seems to contain the first column of data.
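
A note on why this happens: as far as I know, aggregate() on a data frame calls the function separately on each column within each group, so x is a single vector rather than the group's rows. A small probe on made-up data illustrates it; d, probe and seen are hypothetical names chosen for this sketch.

```r
# Small made-up data frame (hypothetical values)
d <- data.frame(id = c(1, 1, 2, 2), a = c(1, 3, 5, 7), b = c(10, 20, 30, 40))

# Record the class of whatever aggregate() hands to FUN on each call
seen <- character(0)
probe <- function(x) {
  seen <<- c(seen, class(x))  # collects the class of each argument received
  length(x)
}

res <- aggregate(d[c("a", "b")], by = list(id = d$id), FUN = probe)
unique(seen)  # inspect what kinds of objects FUN received
```

A function that needs several columns at once therefore has to be applied to whole subsets of the data frame, for example via split() and lapply().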

I could probably have done this manually in the time spent writing this
email.
any help appreciated,
Tom
