[R] crosstabulation and unlist function

David Winsemius dwinsemius at comcast.net
Mon Oct 12 21:36:39 CEST 2009


On Oct 12, 2009, at 3:25 PM, David Winsemius wrote:

>
> On Oct 12, 2009, at 2:36 PM, eugen pircalabelu wrote:
>
>> Hello R-users,
>>
>> My toy example:
>> aa<-c(1:5)
>> bb<-c(NA,2,NA,4,5)
>> cc<-c(1,2,NA,4,NA)
>> dd<-c("A","B","B","A","C")
>> df<-data.frame(aa,bb,cc,dd=as.factor(dd))
>> table(unlist(df[,1:3]))
>>
>> Can anyone point me to what function lets me do a crosstabulation
>> between table(unlist(df[,1:3])) and df$dd?
>> I want to find out, when dd==A (or B, or C), how many times the
>> values 1, 2, 3, ... appear in df[,1:3].
>> Thank you very much!
>
> One way would be to collect the row sums of those columns first, and
> then sum those within each level of dd:
>
> tapply(apply(df[,1:3],1,sum, na.rm=TRUE), df$dd, sum)
>  A  B  C
> 14  9 10
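
As a side note, the row-sum step could probably also be written with rowSums(), which avoids the apply() call. The sketch below rebuilds the toy data from the question; the names row_totals and by_group are just illustrative and do not appear in the original post.

```r
# Same toy data as in the original question
aa <- 1:5
bb <- c(NA, 2, NA, 4, 5)
cc <- c(1, 2, NA, 4, NA)
dd <- factor(c("A", "B", "B", "A", "C"))
df <- data.frame(aa, bb, cc, dd)

# rowSums() is a vectorised stand-in for apply(df[, 1:3], 1, sum, na.rm = TRUE)
row_totals <- rowSums(df[, 1:3], na.rm = TRUE)

# Sum the per-row totals within each level of dd
by_group <- tapply(row_totals, df$dd, sum)
by_group
```

This should give the same per-level totals as the apply() version (14, 9 and 10 for A, B and C).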

This method is safer than working on table(unlist(df[, 1:3])), since it
does not "break" when an entire row is NA:

 > aa<-c(1,2,NA,4,5)
 > bb<-c(NA,2,NA,4,5)
 > cc<-c(1,2,NA,4,NA)
 > dd<-c("A","B","B","A","C")
 > df<-data.frame(aa,bb,cc,dd=as.factor(dd))
 > table(unlist(df[,1:3]))

1 2 4 5
2 3 3 2     # missing row will no longer be aligned with "dd".
 > tapply(table(unlist(df[,1:3])), df$dd, sum)
Error in tapply(table(unlist(df[, 1:3])), df$dd, sum) :
   arguments must have same length

 > tapply(apply(df[,1:3],1,sum, na.rm=TRUE), df$dd, sum)
  A  B  C
14  6 10
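
Note that the tapply() call above returns sums of the values per level of dd. If counts of each value per level are what is wanted (which is how the original question reads), one possible sketch is to repeat dd once per stacked column and pass both vectors to table(). The names values, groups and xtab are illustrative only, and this is an assumption about the intent rather than a tested recommendation.

```r
# Second toy data set from above (row 3 is entirely NA)
aa <- c(1, 2, NA, 4, 5)
bb <- c(NA, 2, NA, 4, 5)
cc <- c(1, 2, NA, 4, NA)
dd <- factor(c("A", "B", "B", "A", "C"))
df <- data.frame(aa, bb, cc, dd)

# unlist() stacks the three columns end to end, so dd has to be
# repeated once per column to stay aligned with the stacked values
values <- unlist(df[, 1:3], use.names = FALSE)
groups <- rep(df$dd, times = 3)

# Two-way table: levels of dd in rows, observed values in columns.
# table() drops NA by default, so the all-NA row adds no counts.
xtab <- table(dd = groups, value = values)
xtab
```

Because both vectors always have the same length, this does not run into the "arguments must have same length" error, and an all-NA row simply adds nothing to the table.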


>
> -- 
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



