[R] extracting index list when using tapply()

Charles C. Berry cberry at tajo.ucsd.edu
Wed Jul 9 01:33:53 CEST 2008

On Tue, 8 Jul 2008, hesicaia wrote:

> Hello,
>  The quick version of my question is how can I extract a matrix instead of
> a vector using tapply()? I would like to be able to access both the results
> of tapply() and also the index variables.
> In case further explanation would help:  I am analyzing a large (3 million
> rows x 9 columns) spatial/temporal dataset and am attempting to calculate
> the number of unique years containing any data within each geographic area
> (10 degree cells in this case). I can do this, but I also want to extract a
> subset vector of the index variable (area).

It really would help to provide a working example, as another poster 
suggested. We cannot test our suggestions without a trial dataset.

> My script to calculate the number of unique years containing any data for
> each area is:
> x<-tapply(years, area, function(x) length(unique(x)))


 	tab <- table( area, years )
 	x <- rowSums ( tab !=0  )
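
As a quick check that the two approaches agree, here is a minimal sketch on 
made-up data. The values of 'area' and 'years' below are hypothetical toy 
values, not the poster's dataset; only the variable names come from the 
question.

```r
# Hypothetical toy data standing in for the real 3-million-row dataset
area  <- c("A", "A", "A", "B", "B", "C")
years <- c(2001, 2001, 2002, 2001, 2003, 2005)

# Original approach: number of unique years per area
x_tapply <- tapply(years, area, function(v) length(unique(v)))

# Table approach: cross-tabulate, then count the non-empty year cells per row
tab     <- table(area, years)
x_table <- rowSums(tab != 0)

# Both results are named by area, so names() recovers the index variable
print(x_tapply)
print(x_table)
```

Either result carries the area labels as names, so names(x_tapply) or 
names(x_table) gives the index values directly.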

> Now, I want to extract the vector of areas where the number of unique years
> containing any data is >20, but tapply() only returns a vector of unique
> years and I want a matrix.

 	x <- rownames(tab)[ rowSums( tab !=0 ) > 20 ]

unless, perhaps, you meant

 	x <- rownames(tab)[ rowSums( tab > 20 ) !=0 ]
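
The two readings can give different answers. A small sketch on made-up data 
shows the difference; the toy values and the threshold 'min_n' (1 here instead 
of 20, so the tiny example has something to select) are assumptions for 
illustration only.

```r
# Hypothetical toy data; 'min_n' plays the role of the 20 in the question
area  <- c("A", "A", "A", "B", "B", "C", "C", "C")
years <- c(2001, 2001, 2002, 2001, 2003, 2005, 2005, 2005)
min_n <- 1

tab <- table(area, years)

# Reading 1: areas with more than min_n distinct years containing data
by_unique_years <- rownames(tab)[rowSums(tab != 0) > min_n]

# Reading 2: areas with at least one year holding more than min_n records
by_heavy_year <- rownames(tab)[rowSums(tab > min_n) != 0]

# The tapply() result is named by area, so reading 1 also works without 'tab'
x <- tapply(years, area, function(v) length(unique(v)))
from_tapply <- names(x)[x > min_n]

print(by_unique_years)
print(by_heavy_year)
print(from_tapply)
```

Here area C has one year with three records, so it is picked up only by the 
second reading, while area B has two sparsely populated years and is picked up 
only by the first.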

> I could use a looping function to do this, but tapply() is much faster with
> large datasets and so I would like to use it if possible.

Depending on the size of the dataset and the number of different years and 
areas, there may be better ways to do this (since 'tab' could be very big 
and sparse). For a start in that direction, see


and perhaps library(Matrix) (on CRAN).
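
One base-R way to avoid building the full area-by-years table, offered only as 
a sketch and not necessarily the approach hinted at above, is to drop 
duplicated (area, year) pairs first and then count rows per area. The toy data 
and the threshold 'min_n' are hypothetical.

```r
# Hypothetical toy data; 'min_n' stands in for the 20 in the question
area  <- c("A", "A", "A", "B", "B", "C", "C", "C")
years <- c(2001, 2001, 2002, 2001, 2003, 2005, 2005, 2005)
min_n <- 1

# Keep one row per distinct (area, year) combination
pairs <- unique(data.frame(area = area, years = years))

# Each remaining row is one unique year for its area, so a one-way
# table of area counts unique years without a dense two-way table
n_years <- table(pairs$area)

# Areas whose number of unique years exceeds the threshold
keep <- names(n_years)[n_years > min_n]

print(n_years)
print(keep)
```

This stores only the combinations that actually occur, which may matter when 
there are many areas and years but most combinations are empty.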



> Any help is appreciated.
> Thanks.

Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
