[R] slowness when I use a list comprehension

Laurent Rhelp laurentRHelp using free.fr
Sun Jun 16 21:29:59 CEST 2024


Thank you for this solution which is faster than the two for loops.

gloop <- function(N1,N2,ratio_sampling,vec1,vec2){

   ix <- seq_along(vec2)
   S_diff2 <- sapply(seq_len(N1-(N2-1)*ratio_sampling), \(j)
                     sum((vec1[(ix-1)*ratio_sampling+j] - vec2[ix])**2))
   return(S_diff2)
}
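
As a sanity check, here is a minimal sketch that compares gloop() with a plain double loop on a small made-up input. The sizes (N1 = 20, N2 = 4, ratio_sampling = 3) and the helper name loop_ref() are assumptions chosen for illustration only; they do not come from the thread.

```r
# Small hypothetical sizes, chosen only so the check runs instantly.
ratio_sampling <- 3
N1 <- 20
N2 <- 4
set.seed(1)
vec1 <- rnorm(N1)
vec2 <- runif(N2)

# gloop() as posted above, written with function(j) so it also runs on R < 4.1.
gloop <- function(N1, N2, ratio_sampling, vec1, vec2) {
  ix <- seq_along(vec2)
  sapply(seq_len(N1 - (N2 - 1) * ratio_sampling), function(j) {
    sum((vec1[(ix - 1) * ratio_sampling + j] - vec2[ix])^2)
  })
}

# loop_ref(): illustrative reference built from two explicit for loops.
loop_ref <- function(N1, N2, ratio_sampling, vec1, vec2) {
  out <- numeric(N1 - (N2 - 1) * ratio_sampling)
  for (j in seq_along(out)) {
    s <- 0
    for (i in seq_len(N2)) {
      s <- s + (vec1[(i - 1) * ratio_sampling + j] - vec2[i])^2
    }
    out[j] <- s
  }
  out
}

res_g <- gloop(N1, N2, ratio_sampling, vec1, vec2)
res_l <- loop_ref(N1, N2, ratio_sampling, vec1, vec2)

# Both versions should agree up to floating-point tolerance.
print(all.equal(res_g, res_l))
```

The check only confirms that the two versions compute the same values; it says nothing about their relative speed.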

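library(microbenchmark)

## dloop() and vloop() must already be defined in the session
## (they are not shown in this message)
microbenchmark(
   S_diff2 <- dloop( N1, N2, ratio_sampling, vec1, vec2 )
   , S_diff3 <- vloop( N1, N2, ratio_sampling, vec1, vec2 )
   , S_diff4 <- gloop( N1, N2, ratio_sampling, vec1, vec2)
   , times = 20
)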

Unit: milliseconds
expr                                                       min        lq      mean   median       uq      max neval cld
S_diff2 <- dloop(N1, N2, ratio_sampling, vec1, vec2)  200.1107 218.10100 230.36871 222.3080 228.7683 303.8214    20   a
S_diff3 <- vloop(N1, N2, ratio_sampling, vec1, vec2)   42.6878  46.65425  73.83425  58.6626 105.4257 145.2059    20   b
S_diff4 <- gloop(N1, N2, ratio_sampling, vec1, vec2)   80.4895  91.24735 133.74064 110.9142 166.4094 233.7112    20   c
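
For comparison with the sapply() version, the same sums can in principle be computed without any per-j function call by building an index matrix once and using colSums(). The sketch below is only an illustration: the name mloop() is hypothetical, it was not proposed in the thread, and it has not been timed here. It allocates matrices with N2 rows and N1-(N2-1)*ratio_sampling columns, so it trades memory for fewer R-level calls.

```r
# mloop(): hypothetical name for a matrix-based variant (not from the thread).
mloop <- function(N1, N2, ratio_sampling, vec1, vec2) {
  n_out <- N1 - (N2 - 1) * ratio_sampling
  # idx[i, j] = (i - 1) * ratio_sampling + j,
  # i.e. the position in vec1 that is paired with vec2[i] for output j.
  idx <- outer((seq_len(N2) - 1) * ratio_sampling, seq_len(n_out), `+`)
  # Indexing a plain vector with a matrix returns a plain vector,
  # so the N2-row shape is restored explicitly (column-major order).
  d <- matrix(vec1[idx], nrow = N2) - vec2  # vec2 is recycled down each column
  colSums(d^2)
}

# Small made-up input, for illustration only.
ratio_sampling <- 3
N1 <- 20
N2 <- 4
set.seed(1)
vec1 <- rnorm(N1)
vec2 <- runif(N2)

# Reference values computed with the sapply() approach.
ix <- seq_along(vec2)
ref <- sapply(seq_len(N1 - (N2 - 1) * ratio_sampling), function(j) {
  sum((vec1[(ix - 1) * ratio_sampling + j] - vec2[ix])^2)
})

res_m <- mloop(N1, N2, ratio_sampling, vec1, vec2)
print(all.equal(res_m, ref))
```

Whether this is faster than vloop() or gloop() on the real data would have to be measured, for example by adding it to the microbenchmark() call.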


On 16/06/2024 at 20:54, Gabor Grothendieck wrote:
> This can be vectorized.  Try
>
> ix <- seq_along(vec2)
> S_diff2 <- sapply(seq_len(N1-(N2-1)*ratio_sampling), \(j)
> sum((vec1[(ix-1)*ratio_sampling+j] - vec2[ix])**2))
>
> On Sun, Jun 16, 2024 at 11:27 AM Laurent Rhelp<laurentRHelp using free.fr>  wrote:
>> Dear RHelp-list,
>>
>>      I am trying to use the comprehenr package to replace a for loop with a
>> list comprehension.
>>
>>    I wrote the code, but I must be missing something because it is much
>> slower than the for loops. Could you please explain why the list
>> comprehension is slower in my case?
>>
>> Here is my example. I compute the sum of squared differences between
>> the values of two vectors, vec1 and vec2. The sampling ratio between
>> vec1 and vec2 is ratio_sampling, so I have to take only every 500th
>> value of the first series before taking the difference with the
>> corresponding value of the second series (vec2).
>>
>> Thank you
>>
>> Best regards
>>
>> Laurent
>>
>> library(tictoc)
>> library(comprehenr)
>>
>> ratio_sampling <- 500
>> ## size of the first series
>> N1 <- 70000
>> ## size of the second series
>> N2 <- 100
>> ## mock data
>> set.seed(123)
>> vec1 <- rnorm(N1)
>> vec2 <- runif(N2)
>>
>>
>> ## 1. with the "for" loops
>>
>> ## the square differences will be stored in a vector
>> S_diff2 <- numeric((N1-(N2-1)*ratio_sampling))
>> tic()
>> for( j in 1:length(S_diff2)){
>>     sum_squares <- 0
>>     for( i in 1:length(vec2)){
>>       sum_squares <- sum_squares + ((vec1[(i-1)*ratio_sampling+j] - vec2[i])**2)
>>     }
>>     S_diff2[j] <- sum_squares
>> }
>> toc()
>> ## 0.22 sec elapsed
>> which.max(S_diff2)
>> ## 7857
>>
>> ## 2. with the list comprehension
>> tic()
>> S_diff2 <- to_vec(for( j in 1:length(S_diff2))
>>     sum(to_vec(for( i in 1:length(vec2))
>>         ((vec1[(i-1)*ratio_sampling+j] - vec2[i])**2))))
>> toc()
>> ## 25.09 sec elapsed
>> which.max(S_diff2)
>> ## 7857
>>