David Winsemius dwinsemius at comcast.net
Wed Oct 30 18:17:42 CET 2013

On Oct 30, 2013, at 4:07 AM, Dan Abner wrote:

> Hi everybody,
> I have data in the format of the example data below where essentially a
> large number of indicator variables (coded [0,1]) reflect traits of the
> same id across multiple rows. I need to represent the data in a 1 row per
> id format. I see this as being similar to converting from long to wide
> format, however, there is no time component here: The multiple rows here
> are all characteristics observed at the same measurement occasion. So,
> really I just need an individual sum for each variable (for a large number
> of variables) and for these to be all saved in the same row (along with the
> id variable and other demographics, e.g., "location").
> Here is the example df and the method I used first:
> d1 <- data.frame(id = c(1,1,1,2,2,2,2,3,3,4),
>                  location = factor(c(rep(0,7), rep(1,3)), labels = c("A","B")),
>                  var1 = as.logical(round(runif(10))),
>                  var2 = as.logical(round(runif(10))),
>                  var3 = as.logical(round(runif(10))))
> d1


> mysum <- aggregate(d1[-(1:2)], by = d1[1:2], FUN = sum)
> mysum
  id location var1 var2 var3
1  1        A    0    2    1
2  2        A    1    2    1
3  3        B    1    0    2
4  4        B    1    1    0
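
A note on why this works, offered as a sketch rather than the only way to do it: sum() counts TRUE as 1 and FALSE as 0, so summing a logical indicator within each id/location group gives the number of rows where the trait was present. Because the example data come from runif() without a seed, the counts shown above will differ from run to run. The version below swaps in fixed indicator values (made up for illustration, not the original poster's draws) and selects columns by name instead of position; `indicator_cols` is a hypothetical helper name.

```r
# Fixed, made-up indicator values so the result is reproducible
# (an assumption for illustration; the original post used runif()).
d1 <- data.frame(
  id       = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 4),
  location = factor(c(rep(0, 7), rep(1, 3)), labels = c("A", "B")),
  var1     = c(TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE),
  var2     = c(FALSE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE),
  var3     = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE)
)

# Every column that is not a grouping variable is treated as an indicator.
group_cols     <- c("id", "location")
indicator_cols <- setdiff(names(d1), group_cols)

# One row per id/location combination; each indicator is summed within group.
mysum <- aggregate(d1[indicator_cols], by = d1[group_cols], FUN = sum)
mysum
```

Selecting by name keeps the call correct if columns are later reordered or more demographics are added to `group_cols`. Note that a grouping variable that varies within an id would split that id across several rows.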

