[R] inefficient for loop, is there a better way?
Yvan Richard
yvan at dragonfly.co.nz
Wed Dec 13 03:26:23 CET 2017
Here is one way of doing it with data.table. It seems to scale up pretty
well: it takes 4 seconds on my computer with ts <- 1:1e6.
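library(data.table)

per   <- 7     # window length, in rows (days)
elev1 <- 0.6   # first stage threshold
elev2 <- 0.85  # second stage threshold

ts    <- 1:1000
examp <- data.table(ts = ts, stage = sin(ts))

# shift(stage, 1:per) returns a list of 'per' lagged copies of 'stage'
# (lags 1 to per, i.e. the 'per' rows before the current one). They are
# bound into a matrix and the values above each threshold are counted row
# by row. Rows with fewer than 'per' earlier rows get NA.
# To include the current row, as the loop in the quoted message does,
# use 0:(per - 1) instead of 1:per.
examp[, `:=`(
    days_abv_0.6_in_last_7  = apply(do.call('cbind', shift(stage, 1:per)),
                                    1, function(x) sum(x > elev1)),
    days_abv_0.85_in_last_7 = apply(do.call('cbind', shift(stage, 1:per)),
                                    1, function(x) sum(x > elev2)))]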
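
The apply() call above still visits every row in turn. A running total of
threshold exceedances gives the same kind of count using only vectorised
operations. The following is a minimal sketch in base R, not a tested
replacement: count_above is a hypothetical helper name, it assumes 'stage'
contains no NA values, and its window includes the current row, as in the
loop in the quoted message (shift(stage, 1:per) instead covers the 'per'
rows before the current one).

```r
# Hypothetical helper: for each element of x, count how many of the last
# 'per' values (current one included) are above 'thresh'.
# Assumes x has no NA values; the first per - 1 results are NA.
count_above <- function(x, thresh, per) {
  n <- length(x)
  out <- rep(NA_integer_, n)
  if (n < per) return(out)
  # cs[i + 1] is the number of values above 'thresh' in x[1:i]
  cs <- c(0L, cumsum(x > thresh))
  idx <- per:n
  # count in rows (i - per + 1):i = total up to i minus total up to i - per
  out[idx] <- cs[idx + 1L] - cs[idx - per + 1L]
  out
}

per <- 7
ts <- 1:1000
examp <- data.frame(ts = ts, stage = sin(ts))
examp$days_abv_0.6_in_last_7  <- count_above(examp$stage, 0.6, per)
examp$days_abv_0.85_in_last_7 <- count_above(examp$stage, 0.85, per)
```

Because 'per' is an argument, the same call can be repeated for other window
lengths without rewriting anything else.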
On 13 December 2017 at 14:36, Morway, Eric <emorway at usgs.gov> wrote:
> The code below is a small reproducible example of a much larger problem.
> While the script below works, it is really slow on the true dataset with
> many more rows and columns. I'm hoping to get the same result in examp,
> but with significant time savings.
>
> The example below is setting up a data.frame for an ensuing regression
> analysis. The purpose of the script below is to append columns to 'examp'
> that contain values corresponding to the total number of days in the
> previous 7 ('per') above some stage ('elev1' or 'elev2'). Is there a
> faster method that leverages existing R functionality? I feel like the
> hack below is pretty clunky and can be sped up on the true dataset. I
> would like to run a more efficient script many times adjusting the value of
> 'per'.
>
> ts <- 1:1000
> examp <- data.frame(ts=ts, stage=sin(ts))
>
> hi1 <- list()
> hi2 <- list()
> per <- 7
> elev1 <- 0.6
> elev2 <- 0.85
> for(i in per:nrow(examp)){
>   examp_per <- examp[seq(i - (per - 1), i, by=1),]
>   stg_hi_cond1 <- subset(examp_per, examp_per$stage > elev1)
>   stg_hi_cond2 <- subset(examp_per, examp_per$stage > elev2)
>
>   hi1 <- c(hi1, nrow(stg_hi_cond1))
>   hi2 <- c(hi2, nrow(stg_hi_cond2))
> }
> examp$days_abv_0.6_in_last_7 <- c(rep(NA, times=per-1), unlist(hi1))
> examp$days_abv_0.85_in_last_7 <- c(rep(NA, times=per-1), unlist(hi2))
--
Yvan Richard, PhD
Environmental data scientist
Physical address: Level 4, 158 Victoria St, Te Aro, Wellington, New Zealand
Postal address: PO Box 27535, Wellington 6141, New Zealand
Phone: 022 643 7881