[R] subset grouped data with quantile and NA's

David Carslaw d.c.carslaw at its.leeds.ac.uk
Fri Aug 22 09:35:29 CEST 2008

I can't quite seem to solve a problem subsetting a data frame.  Here's a
reproducible example. 

Given a data frame:

dat <- data.frame(fac = rep(c("a", "b"), each = 100),
                  value = c(rnorm(130), rep(NA, 70)),
                  other = rnorm(200))

What I want is a new data frame (with the same columns as dat) that excludes
the top 5% of "value", computed separately for groups "a" and "b". For
example, this produces the values I'm after, but as an array of vectors
rather than a data frame:

sub <- tapply(dat$value, dat$fac,
              function(x) x[x < quantile(x, probs = 0.95, na.rm = TRUE)])

My difficulty is putting them into a data frame along with the other columns
"fac" and "other". Note that the subsetting returns vectors of different
lengths for a and b, partly because the two groups contain different numbers
of NAs.
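
To make the length mismatch concrete, the sketch below inspects what tapply
returns. The set.seed call and the names rows_per_group, kept_lengths and
kept_nas are assumptions added for illustration only; they are not part of
the example above.

```r
# Assumption: a fixed seed (not in the original example) so runs repeat.
set.seed(1)
dat <- data.frame(fac = rep(c("a", "b"), each = 100),
                  value = c(rnorm(130), rep(NA, 70)),
                  other = rnorm(200))

sub <- tapply(dat$value, dat$fac,
              function(x) x[x < quantile(x, probs = 0.95, na.rm = TRUE)])

# Rows per group in the original data frame (100 each).
rows_per_group <- table(dat$fac)

# Length of each vector returned by tapply. Comparing NA with the cutoff
# gives NA, so the NAs in group "b" are carried through as NA elements.
kept_lengths <- sapply(sub, length)

# Number of NA elements carried through in each group.
kept_nas <- sapply(sub, function(x) sum(is.na(x)))

print(rows_per_group)
print(kept_lengths)
print(kept_nas)
```

With this layout the vectors come out with 95 elements for "a" and 98 for
"b" (28 values plus the 70 NAs), so neither lines up with the 100 rows per
group in dat, and the link back to the matching rows of "other" is lost.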

There's something I'm just not seeing - can you help?

Many thanks.

David Carslaw

Institute for Transport Studies
University of Leeds