[R] Percentiles with R for a big data.frame

Duncan Murdoch murdoch.duncan at gmail.com
Tue Jan 22 00:54:36 CET 2013


On 13-01-21 6:41 PM, Simonas Kecorius wrote:
> Dear R users,
>
> I came across a problem dealing with percentiles in R.
>
> From my previous questions: I have a big data.frame, with lots of
> columns and rows. The following command enables me to calculate means for
> the whole data frame.
>
> dat1$newID<-rep(1:(nrow(dat1)/12),each=12) #if nrow(dat1)/12 is integer
>
> dat2<-with(dat1,aggregate(cbind(dat1[,1:71]),by=list(newID),mean))
>
> What I need is to calculate percentiles for each group (there are 12 values
> in a group). I tried the following:
>
> duomenai<-with(dat1,aggregate(cbind(dat1[,1:71]),by=list(newID),quantiles,0.1,type=4))

You didn't define quantiles, so that won't work.  Assuming that's a 
typo, and you meant quantile...
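
A minimal sketch of the call with quantile in place of quantiles, run on 
a small made-up stand-in for dat1 (3 columns and 24 rows instead of the 
real 71 columns; the column names a, b, c are invented for illustration). 
The extra arguments after FUN are passed through to quantile():

```r
# Hypothetical toy data standing in for the real dat1
set.seed(1)
dat1 <- data.frame(a = rnorm(24), b = rnorm(24), c = rnorm(24))

# One group ID per block of 12 consecutive rows
dat1$newID <- rep(seq_len(nrow(dat1) / 12), each = 12)

# 10th percentile of each column within each group, using quantile type 4
duomenai <- aggregate(dat1[, 1:3], by = list(newID = dat1$newID),
                      FUN = quantile, probs = 0.1, type = 4)
duomenai
```

Passing dat1[, 1:3] directly makes the with() and cbind() wrappers 
unnecessary, though the original form should also work once the function 
name is corrected.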
>
>
> First, is the syntax above right?
> Secondly, I tried to calculate percentiles using OpenOffice and there is
> disagreement between the values. If I do the calculation for a single row
> of numbers, then the R and OpenOffice values coincide, but for a
> data.frame it seems that something goes wrong.

There are lots of different formulas for empirical quantiles.  The ones 
available in R are described in the ?quantile help topic.  What formula 
does OpenOffice use?
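
As a rough illustration of how much the choice of formula can matter, 
the sketch below evaluates the 10th percentile of one made-up group of 
12 values under each of the nine types that quantile() offers. The 
vector x is an assumption chosen only for the example:

```r
# One hypothetical group of 12 values
x <- 1:12

# 10th percentile under each of the nine definitions in ?quantile
by_type <- sapply(1:9, function(t) {
  quantile(x, probs = 0.1, type = t, names = FALSE)
})
names(by_type) <- paste0("type", 1:9)
by_type
```

Comparing a spreadsheet result for the same 12 values against this 
vector is one way to see which type, if any, the spreadsheet matches.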

Duncan Murdoch


