[R] Need to aggregate large dataset by week...

John Kane jrkrideau at inbox.com
Sat Feb 11 17:12:36 CET 2012


If I understand what you want, here are two possible ways to approach the problem.  One uses aggregate() from base R, and the other uses the reshape package to melt and cast the data into the form you want.

For the second approach you first need to install the reshape package, e.g. with install.packages("reshape").

Assuming your dataset is named xx:

# Base R: mean of every column within each week (the grouping column is returned as Group.1)
aggregate(xx, by=list(xx$week), mean)
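If the extra Group.1 column is a nuisance, the formula interface of aggregate() may be a bit tidier. This is only a sketch on made-up toy data standing in for xx. Note that the formula interface drops rows containing NA by default, which may or may not be what you want.

```r
# Toy data standing in for xx (values are made up for illustration)
xx <- data.frame(
  week      = c(1, 1, 2, 2),
  rainfall  = c(0.2, 0.0, 0.4, 0.2),
  windspeed = c(0.9, 0.7, 1.1, 1.3)
)

# One row per week, one column of means for each remaining variable
weekly <- aggregate(. ~ week, data = xx, FUN = mean)
print(weekly)
```

The result has rows = weeks and columns = variables, with the week kept under its own name.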

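# reshape: melt to long form (one row per week/variable pair),
# then cast back to one row per week, taking the mean of each variable
library(reshape)
mm <- melt(xx, id=c("week"))
cast(mm, week ~ variable, mean)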
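You also asked about aggregating by month from the dates. Something along these lines might work. It is a sketch only: the column name `date` and the month/day/year text format are my assumptions about your data, and the toy values are made up.

```r
# Toy data; the column name `date` and its month/day/year text format are assumptions
xx <- data.frame(
  date     = c("01/05/2011", "01/20/2011", "02/03/2011", "02/17/2011"),
  rainfall = c(0.2, 0.0, 0.4, 0.2),
  temp     = c(1.0, 2.0, 3.0, 5.0)
)

# Parse the text dates, then build a year-month grouping key
xx$date  <- as.Date(xx$date, format = "%m/%d/%Y")
xx$month <- format(xx$date, "%Y-%m")

# Drop the raw date column so that only the numeric variables are averaged
monthly <- aggregate(. ~ month, data = xx[, names(xx) != "date"], FUN = mean)
print(monthly)
```

If your data also has the week column, you would probably want to drop it in the same way before averaging by month.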




John Kane
Kingston ON Canada


> -----Original Message-----
> From: revdan20 at gmail.com
> Sent: Fri, 10 Feb 2012 04:55:44 -0800 (PST)
> To: r-help at r-project.org
> Subject: [R] Need to aggregate large dataset by week...
> 
> Hi all,
> 
> I have a large dataset with ~8600 observations that I want to compress to
> weekly means. There are 9 variables (columns), and I have already added a
> "week" column with 51 weeks. I have been looking at the functions:
> aggregate, tapply, apply, etc. and I am just not savvy enough with R to
> figure this out on my own, though I'm sure it's fairly easy. I also have
> the Dates (month/day/year) for all of the observations, but I figured just
> having a week column may be easier. If someone wanted to show me how to
> organize this data using a date function and aggregating by month that
> would be useful too!
> 
> Here's an example of the data set, with only 5 of the variables and 10 of
> 8600 obs.:
> 
>      week    rainfall windspeed   winddir      temp   oakdepth
> 1       1  0.20000000   0.89000  245.9200  1.150000   4.400000
> 2       1  0.00000000   0.84000  292.8800  1.190000   5.300000
> 3       1  0.20000000   0.74000  258.5400  1.360000   6.000000
> 4       1  0.00000000   0.93000    3.7000  1.430000   4.400000
> 5       1  0.20000000   0.69000   37.8200  1.560000   5.200000
> 6       1  0.00000000   0.80000   17.2900  1.690000   4.400000
> 7       1  0.20000000   0.70000   28.7300  1.880000   5.000000
> 8       1  0.20000000   1.12000  294.3700  1.930000   6.000000
> 9       1  0.00000000   1.21000  274.9700  1.800000   4.400000
> 10      1  0.00000000   1.31000  279.2400  1.860000   5.800000
> 
> ...so after about 170 observations it changes to week 2, and so on.
> 
> I've tried something like this, but it's only one variable's mean, and I
> would rather have the rows=weeks and columns= the different variables.
> 
> > tapply(metdata$rainfall,metdata$week,FUN=mean)
>           1           2           3           4           5           6
> 0.080952381 0.101190476 0.379761905 0.179761905 0.000000000 0.295238095
>           7           8           9          10          11          12
> 0.146428571 0.015476190 0.163888889 0.098809524 0.065476190 0.215476190
> 
> Hope this is enough information and that I'm not just re-asking an old
> question. Thanks so much in advance for any help.
> 
