[R] Take average of previous weeks

Bert Gunter bgunter.4567 at gmail.com
Sun Mar 25 22:05:11 CEST 2018


I am sure that this sort of thing has been asked and answered before,
so in case my suggestions don't work for you, just search the archives
a bit more.
I am also sure that it can be handled directly by numerous functions
in numerous packages, e.g. via time series methods or by calculating
running means of suitably shifted series.

However, as it seems to be a straightforward task, I'll provide what I
think is a simple solution in base R. Adjust to your situation.

## First I need a little utility function to offset rows. Lots of ways
to do this,many nicer than this I'm sure.

> shift <- function(x,k)
+    ## x is a vector of values -- e.g. of a column in your df
+ {
+    sapply(seq_len(k),function(i)c(rep(NA,i),head(x,-i)))
+ }
>
>
> ## Testit
> x <- c(1,3,5,7,8:11)
> m <- shift(x,3) ## matrix of prior values up to lag 3
> m ## note rows have been omitted where lags don't exist
     [,1] [,2] [,3]
[1,]   NA   NA   NA
[2,]    1   NA   NA
[3,]    3    1   NA
[4,]    5    3    1
[5,]    7    5    3
[6,]    8    7    5
[7,]    9    8    7
[8,]   10    9    8
> rowMeans(m) ## means of previous 3
[1]       NA       NA       NA 3.000000 5.000000 6.666667 8.000000 9.000000
> rowMeans(m[,1:2]) ## means of previous 2
[1]  NA  NA 2.0 4.0 6.0 7.5 8.5 9.5


Cheers,
Bert






Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Mar 25, 2018 at 7:48 AM, Miluji Sb <milujisb at gmail.com> wrote:
> Dear all,
>
> I have weekly data by city (variable citycode). I would like to take the
> average of the previous two, three, four weeks (without the current week)
> of the variable called value.
>
> This is what I have tried to compute the average of the two previous weeks;
>
> df = df %>%
>   mutate(value.lag1 = lag(value, n = 1)) %>%
>   mutate(value .2.previous = rollapply(data = value.lag1,
>                                      width = 2,
>                                      FUN = mean,
>                                      align = "right",
>                                      fill = NA,
>                                      na.rm = T))
>
> I crated the lag of the variable first and then attempted to compute the
> average but this does not seem to to what I want. What I am doing wrong?
> Any help will be appreciated. The data is below. Thank you.
>
> Sincerely,
>
> Milu
>
> dput(droplevels(head(df, 10)))
> structure(list(year = c(1970L, 1970L, 1970L, 1970L, 1970L, 1970L,
> 1970L, 1970L, 1970L, 1970L), citycode = c(1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L), month = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
> 2L, 3L), week = c(1L, 2L, 3L, 4L, 5L, 5L, 6L, 7L, 8L, 9L), date =
> structure(c(1L,
> 2L, 3L, 4L, 5L, 5L, 6L, 7L, 8L, 9L), .Label = c("1970-01-10",
> "1970-01-17", "1970-01-24", "1970-01-31", "1970-02-07", "1970-02-14",
> "1970-02-21", "1970-02-28", "1970-03-07"), class = "factor"),
>     value = c(-15.035, -20.478, -22.245, -23.576, -8.84099999999995,
>     -18.497, -13.892, -18.974, -15.919, -13.576)), .Names = c("year",
> "citycode", "month", "week", "date", "tmin"), row.names = c(NA,
> 10L), class = "data.frame")
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list