[R] question about the aggregate function with respect to order of levels of grouping elements

Gabor Grothendieck ggrothendieck at gmail.com
Sun Dec 16 15:54:50 CET 2007


In fact, even ordinary aggregate works OK with zoo's as.yearmon:

> aggregate(rnum, list(dts = as.yearmon(dts)), sum)
        dts           x
1  Jan 2001  4.43610085
2  Feb 2001  0.49842227
3  Mar 2001  7.52139932
4  Apr 2001  1.47917343
5  May 2001 10.64459923
6  Jun 2001 -1.22530586
7  Jul 2001  8.19563685
8  Aug 2001  1.57626974
9  Sep 2001  1.28842871
10 Oct 2001  2.50540074
11 Nov 2001  0.71156951
12 Dec 2001  0.54118342
13 Jan 2002 -0.41292840
14 Feb 2002 -2.41301496
15 Mar 2002  3.23783551
16 Apr 2002  0.63914807
17 May 2002 -1.46357402
18 Jun 2002  2.91651492
19 Jul 2002  2.17263290
20 Aug 2002 -2.30981022
21 Sep 2002 -9.60701788
22 Oct 2002  1.16504368
23 Nov 2002 -3.07038254
24 Dec 2002  1.38281927
25 Jan 2003  4.48761479
26 Feb 2003  2.42455090
27 Mar 2003 -0.03743888
28 Apr 2003  1.11223001
29 May 2003 -4.07988016
30 Jun 2003 -1.15116293
31 Jul 2003 -7.15292576
32 Aug 2003 -2.34231702
33 Sep 2003 -0.48132751
34 Oct 2003 11.74252191
35 Nov 2003  2.51063034
36 Dec 2003 -4.35801058

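As far as I can tell, this works because the grouping key is a time class whose values sort chronologically, rather than a character label that sorts alphabetically. Below is a minimal base-R sketch of the same idea. It assumes Date values for the first day of each month as a stand-in for as.yearmon. The name month_start and the regenerated dts and rnum are illustrative, not taken from the original post.

```r
# Illustrative data: daily dates over three years, as in the thread,
# but built with base R Dates instead of chron (an assumption).
set.seed(1)
dts  <- seq(as.Date("2001-01-01"), as.Date("2003-12-31"), by = "day")
rnum <- rnorm(length(dts))

# Hypothetical grouping key: the first day of each observation's month.
# Date values compare chronologically, so the groups do too.
month_start <- as.Date(format(dts, "%Y-%m-01"))

# One row per month, with the sums in a column named x.
agg <- aggregate(rnum, list(dts = month_start), sum)
head(agg)
```

The key is a single column here; format(agg$dts, "%b %Y") can be used afterwards if a printed month label is wanted.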

On Dec 16, 2007 9:50 AM, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> This does look strange.  Note that aggregate.zoo in the zoo package
> would work here:
>
> > library(zoo)
> > aggregate(zoo(rnum, dts), as.yearmon, sum)
>   Jan 2001    Feb 2001    Mar 2001    Apr 2001    May 2001    Jun 2001
>  4.43610085  0.49842227  7.52139932  1.47917343 10.64459923 -1.22530586
>   Jul 2001    Aug 2001    Sep 2001    Oct 2001    Nov 2001    Dec 2001
>  8.19563685  1.57626974  1.28842871  2.50540074  0.71156951  0.54118342
>   Jan 2002    Feb 2002    Mar 2002    Apr 2002    May 2002    Jun 2002
> -0.41292840 -2.41301496  3.23783551  0.63914807 -1.46357402  2.91651492
>   Jul 2002    Aug 2002    Sep 2002    Oct 2002    Nov 2002    Dec 2002
>  2.17263290 -2.30981022 -9.60701788  1.16504368 -3.07038254  1.38281927
>   Jan 2003    Feb 2003    Mar 2003    Apr 2003    May 2003    Jun 2003
>  4.48761479  2.42455090 -0.03743888  1.11223001 -4.07988016 -1.15116293
>   Jul 2003    Aug 2003    Sep 2003    Oct 2003    Nov 2003    Dec 2003
> -7.15292576 -2.34231702 -0.48132751 11.74252191  2.51063034 -4.35801058
>
>
>
> On Dec 16, 2007 9:23 AM, tom soyer <tom.soyer at gmail.com> wrote:
> > Hi,
> >
> > I am using aggregate() to add up groups of data according to year and month.
> > It seems that the function aggregate() automatically sorts the levels of
> > factors of the grouping elements, even if the order of the levels of factors
> > is supplied. I am wondering if this is a bug, or if I missed something
> > important. Below is an example that shows what I mean. Does anyone know if
> > this is just the way the aggregate function works, or are there ways
> > to force aggregate() to keep the order of levels of factors supplied by the
> > grouping elements? Thanks!
> >
> > library(chron)
> > dts=seq.dates("1/1/01","12/31/03")
> > rnum=rnorm(1:length(dts))
> > df=data.frame(date=dts,obs=rnum)
> > agg=aggregate(df[,2],list(year=years(df[,1]),month=months(df[,1])),sum)
> > levels(agg$month) # aggregate() automatically generates levels sorted
> >                    # alphabetically.
> >
> > [1] "Apr" "Aug" "Dec" "Feb" "Jan" "Jul" "Jun" "Mar" "May" "Nov" "Oct" "Sep"
> >
> > fmonth=factor(months(df[,1]))
> > levels(fmonth) # factor() automatically generates the correct order of
> >                # levels.
> >
> > [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
> >
> >
> > agg2=aggregate(df[,2],list(year=years(df[,1]),month=fmonth),sum)
> > levels(agg2$month) # even if a factor with levels in the correct order is
> >                    # supplied, aggregate() sorts the levels alphabetically.
> >
> > [1] "Apr" "Aug" "Dec" "Feb" "Jan" "Jul" "Jun" "Mar" "May" "Nov" "Oct" "Sep"
> >
> >
> > --
> > Tom
> >
>

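If separate year and month columns are wanted, as in the original question, one possible workaround is to re-impose the calendar order on the month labels after aggregating. The sketch below is base R only and makes some assumptions: Dates replace the chron objects, and the month labels are built from month.abb so they do not depend on the locale. The names mon and agg are illustrative.

```r
# Illustrative data frame with the same shape as df in the question.
set.seed(1)
dts <- seq(as.Date("2001-01-01"), as.Date("2003-12-31"), by = "day")
df  <- data.frame(date = dts, obs = rnorm(length(dts)))

# English month abbreviations derived from the month number.
mon <- month.abb[as.integer(format(df$date, "%m"))]
agg <- aggregate(df$obs, list(year = format(df$date, "%Y"), month = mon), sum)

# Rebuild the month factor with calendar-ordered levels, regardless of
# the order aggregate() returned, then sort rows by year and month.
agg$month <- factor(as.character(agg$month), levels = month.abb)
agg <- agg[order(agg$year, agg$month), ]
levels(agg$month)
```

Setting the levels explicitly after the call avoids relying on how a particular version of aggregate() treats the levels of its grouping factors.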

