[R] Percentiles with R for a big data.frame
David Winsemius
dwinsemius at comcast.net
Fri Jan 25 16:23:24 CET 2013
On Jan 23, 2013, at 5:45 AM, Simonas Kecorius wrote:
> I found a code:
>
> y.ts <- ts(data, frequency=12)
> aggregate(y.ts, FUN=quantile, probs=0.10)
>
> Seems it works fine even for a big data.frame.
Except for the fact that 'y.ts' is not a dataframe, so you are using a
function that has different arguments than `aggregate.data.frame`.
With the `ts` call you implicitly constructed `ts(data.matrix(data),
frequency=12)` and will be getting quantile estimates on groups of 12,
which is not at all what you asked for in the first place.
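
To illustrate the difference, here is a minimal sketch. It assumes a made-up data frame `dat1` of 288 rows and 71 numeric columns standing in for your data, and a hypothetical grouping vector `grp` that labels each block of 24 consecutive rows. It is only one way to set this up, not necessarily the best one for your real data.

```r
# Hypothetical stand-in for the poster's data: 288 rows, 71 numeric columns.
set.seed(1)
dat1 <- as.data.frame(matrix(rnorm(288 * 71), nrow = 288, ncol = 71))

# One label per block of 24 consecutive rows (288 / 24 = 12 groups).
grp <- rep(seq_len(nrow(dat1) / 24), each = 24)

# aggregate.data.frame applies FUN to every column within each group;
# the extra arguments (probs, type) are passed on to quantile().
p10 <- aggregate(dat1, by = list(group = grp), FUN = quantile,
                 probs = 0.10, type = 4)
dim(p10)   # 12 rows (one per group), 72 columns ('group' plus 71 variables)

# The ts route: with frequency = 12, aggregate.ts collapses each run of
# 12 consecutive rows, so 288 rows become 24 rows rather than 12.
by12 <- aggregate(ts(data.matrix(dat1), frequency = 12),
                  FUN = quantile, probs = 0.10)
nrow(by12)
```

The grouping vector makes the block size explicit, so changing 24 to another divisor of nrow(dat1) only requires editing one line.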
--
David.
>
> Thanks for your help.
>
> 2013/1/22 David Winsemius <dwinsemius at comcast.net>
>
> On Jan 22, 2013, at 5:58 AM, Simonas Kecorius wrote:
>
> Hey Duncan,
>
> I can't imagine what formula OpenOffice uses for quantiles either. I
> checked a data series of 24 values, calculating the quantiles with both
> OpenOffice and R. The results are identical. The problem arises when I
> try to implement the quantile calculation in this form:
> dat2<-with(dat1,aggregate(cbind(dat1[,1:71]),by=list(newID),quantiles,0.1,type=4))
> This code does not generate an error, but I guess it does not give the
> right result either.
>
> You guess? What result and what is "right"?
>
>
> So my question would be:
> How could I calculate quantiles for a big data.frame in R (71 columns
> and 288 rows)? I need to take 24 rows, calculate quantiles, then take
> another 24 rows, etc., for all 71 columns.
>
>
> You have already been told that you are misspelling the name of the
> R function.
>
> The other open question in my mind is whether you were hoping for
> something other than a single quantile (in this case the 10th
> percentile), or perhaps wanted the quantiles that would divide your
> data into deciles?
>
> If you want to do the calculation within groups then the second
> argument to `aggregate` must specify the grouping. By design
> `aggregate` will apply the function on all columns.
> --
> David.
>
> Thanks in advance.
>
>
>
>
> 2013/1/22 Duncan Murdoch <murdoch.duncan at gmail.com>
>
> On 13-01-21 6:41 PM, Simonas Kecorius wrote:
>
> Dear R users,
>
> I ran into a problem dealing with percentiles in R.
>
> From my previous questions: I have a big data.frame, with lots of
> columns and rows. The following commands enable me to calculate group
> means for the whole data frame.
>
> dat1$newID<-rep(1:(nrow(dat1)/12),each=12) #if nrow(dat1)/12 is integer
>
> dat2<-with(dat1,aggregate(cbind(dat1[,1:71]),by=list(newID),mean))
>
>
> What I need is to calculate percentiles for each group (there are 12
> values in a group). I tried the following:
>
> duomenai<-with(dat1,aggregate(cbind(dat1[,1:71]),by=list(newID),quantiles,0.1,type=4))
>
>
> You didn't define quantiles, so that won't work. Assuming that's a
> typo, and you meant quantile...
>
>
>
> First, is this syntax right?
> Secondly, I tried to calculate percentiles using OpenOffice and there
> is disagreement between the values. If I do the calculation for a
> single series of numbers, then the R and OpenOffice numbers coincide,
> but for a data.frame it seems that something goes wrong.
>
>
> There are lots of different formulas for empirical quantiles. The ones
> available in R are described in the ?quantile help topic. What formula
> does OpenOffice use?
>
> Duncan Murdoch
>
>
>
>
> --
> Simonas Kecorius
>
> David Winsemius, MD
> Alameda, CA, USA
>
>
>
>
> --
> Simonas Kecorius
>
David Winsemius, MD
Alameda, CA, USA