[R] time-series aggregation of information

David Winsemius dwinsemius at comcast.net
Fri May 17 21:32:52 CEST 2013


On May 17, 2013, at 11:48 AM, Chirag Maru wrote:

> I have the following data, for which I need to calculate the weighted aggregate value of the parameter at each time.
> 
> Date,Parameter,Weight
> 2012-01-31,90,200
> 2012-01-31,80,400
> 2012-01-31,70,500
> 2012-01-31,60,800
> 2012-02-29,120,220
> 2012-02-29,110,410
> 2012-02-29,75,520
> 2012-02-29,65,840
> 2012-03-31,115,210
> 2012-03-31,100,405
> 2012-03-31,70,500
> 2012-03-31,60,800
> 
> So for the above sample the solution looks like:
> 
> Date,Weighted Parameter
> 2012-01-31,70
> 2012-02-29,82.96482412
> 2012-03-31,77.10182768
> 
> Could I potentially use tapply / aggregate for this? I would like to avoid a for loop if possible.

by() handles this without a loop (assuming the data are in a data frame named dat):

by(dat, dat[1], FUN=function(d) weighted.mean(d[["Parameter"]], w=d[["Weight"]]) )
Date: 2012-01-31
[1] 70
------------------------------------------------------------ 
Date: 2012-02-29
[1] 82.96482
------------------------------------------------------------ 
Date: 2012-03-31
[1] 77.10183
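
The by() call above relies on a data frame named dat that the post never shows being created. As a minimal sketch, assuming the sample from the question is simply pasted in as CSV text (the text= argument of read.csv is one way to do that; the name res is only illustrative), the whole thing can be reproduced like this:

```r
# Build `dat` from the sample data in the question (assumed to be pasted inline).
dat <- read.csv(text = "Date,Parameter,Weight
2012-01-31,90,200
2012-01-31,80,400
2012-01-31,70,500
2012-01-31,60,800
2012-02-29,120,220
2012-02-29,110,410
2012-02-29,75,520
2012-02-29,65,840
2012-03-31,115,210
2012-03-31,100,405
2012-03-31,70,500
2012-03-31,60,800", stringsAsFactors = FALSE)

# by() passes each date's rows to FUN as a data frame, so both columns are available.
res <- by(dat, dat[1],
          FUN = function(d) weighted.mean(d[["Parameter"]], w = d[["Weight"]]))
print(res)
```

Because FUN receives a whole sub-data-frame rather than a single column, the weights stay aligned with the values they belong to.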

It's a bit of a shame that there is no as.data.frame.by function. You can still create a data frame from the by object with as.data.frame.table; the only defect is that the second column ends up named "Freq":

as.data.frame.table(by(dat, dat[1], FUN=function(d) weighted.mean(d[["Parameter"]], w=d[["Weight"]]) ))

        Date     Freq
1 2012-01-31 70.00000
2 2012-02-29 82.96482
3 2012-03-31 77.10183

Wrapping the call in setNames fixes that:

setNames(as.data.frame.table(by(dat, dat[1], FUN=function(d) weighted.mean(d[["Parameter"]], w=d[["Weight"]]) )),
         c("Dts", "wtdmeans"))

         Dts wtdmeans
1 2012-01-31 70.00000
2 2012-02-29 82.96482
3 2012-03-31 77.10183
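
On the tapply part of the question: tapply hands FUN a single vector, so it cannot see Parameter and Weight together directly. One common workaround, sketched here rather than taken from the reply above, is to run tapply over the row indices and look both columns up inside the function. The names wtd and res are only illustrative, and dat is rebuilt so the snippet stands alone:

```r
# Sample data from the question, rebuilt so this snippet is self-contained.
dat <- data.frame(
  Date      = rep(c("2012-01-31", "2012-02-29", "2012-03-31"), each = 4),
  Parameter = c(90, 80, 70, 60, 120, 110, 75, 65, 115, 100, 70, 60),
  Weight    = c(200, 400, 500, 800, 220, 410, 520, 840, 210, 405, 500, 800)
)

# Group the row numbers by date; each group of indices picks out matching
# values and weights, so weighted.mean() sees both columns.
wtd <- tapply(seq_len(nrow(dat)), dat$Date,
              function(i) weighted.mean(dat$Parameter[i], w = dat$Weight[i]))

# tapply returns a named 1-d array; turn it into a two-column data frame.
res <- data.frame(Date = names(wtd), wtdmean = as.vector(wtd))
print(res)
```

This gives a data frame directly with the column names of your choosing, at the cost of the function reaching back into dat instead of receiving its rows as an argument.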


-- 

David Winsemius
Alameda, CA, USA


