[R] in continuation with the earlier R puzzle
Joshua Wiley
jwiley.psych at gmail.com
Mon Jul 12 20:40:11 CEST 2010
I wanted to point out one thing that Ted said, about initializing the
vectors ('s' in your example). This can make a dramatic speed
difference if you are using a for loop (the difference is negligible
with vectorized computations).
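
As a minimal sketch of what "initializing" means here (the helper names fill_growing and fill_preallocated, the seed, and the small n are my own choices for illustration, not anything from the thread): the first function starts from NULL and extends the vector on every iteration, while the second allocates the full length once. My understanding is that the growing version is slow mainly because R has to allocate a longer vector and copy the old contents as it extends, but treat that as an assumption rather than a measured fact.

```r
# Hypothetical helpers for illustration; not from the original thread
fill_growing <- function(x, y) {
  z <- NULL                      # starts empty, so each z[i] <- ... extends it
  for (i in seq_along(x)) {
    z[i] <- if (x[i] > y[i]) 1 else -1
  }
  z
}

fill_preallocated <- function(x, y) {
  z <- numeric(length(x))        # full length allocated once, up front
  for (i in seq_along(x)) {
    z[i] <- if (x[i] > y[i]) 1 else -1
  }
  z
}

set.seed(1)                      # arbitrary seed; small n so it runs quickly
a <- rnorm(1e4)
b <- rnorm(1e4)

system.time(r1 <- fill_growing(a, b))
system.time(r2 <- fill_preallocated(a, b))
identical(r1, r2)                # both approaches should give the same vector
```

Timings will vary by system and R version, so compare the two numbers on your own machine rather than against mine.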
Also, a lot of benchmarks have been flying around, each from a
different system and using random numbers without identical seeds. So
to provide an overall comparison of all the methods I saw here plus
demonstrate the speed difference for initializing a vector (if you
know its desired length in advance), I ran these benchmarks.
Notes:
I did not want to interfere with your objects so I used different
names. The equivalencies are: news1o = x; s2o = y; s = z.
system.time() automatically calculates the time difference from
proc.time() between start and finish.
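
To make that note concrete, here is a rough sketch of timing something by hand with proc.time() next to the system.time() equivalent. The expression being timed is an arbitrary placeholder, and the two timings will not match exactly because they are separate runs.

```r
# Manual timing with proc.time(): roughly what system.time() does for you
start <- proc.time()
invisible(sum(sqrt(seq_len(1e5))))   # placeholder for the code you want to time
manual <- proc.time() - start

# The same kind of measurement via system.time()
auto <- system.time(invisible(sum(sqrt(seq_len(1e5)))))

# Both carry user, system and elapsed components
print(manual[c("user.self", "sys.self", "elapsed")])
print(auto[c("user.self", "sys.self", "elapsed")])
```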
> ##R version info
> sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-pc-mingw32
#snipped
>
> ##Some Sample Data
> set.seed(10)
> x <- rnorm(10^6)
> set.seed(15)
> y <- rnorm(10^6)
>
> ##Benchmark 1
> z.1 <- NULL
> system.time(for(i in 1:length(x)) {
+   if(x[i] > y[i]) {
+     z.1[i] <- 1
+   } else {
+     z.1[i] <- -1}
+ }
+ )
user system elapsed
1303.83 174.24 1483.74
>
> ##Benchmark 2
> #initialize 'z' at its final length
> z.2 <- vector("numeric", length = 10^6)
> system.time(for(i in 1:length(x)) {
+   if(x[i] > y[i]) {
+     z.2[i] <- 1
+   } else {
+     z.2[i] <- -1}
+ }
+ )
user system elapsed
3.77 0.00 3.77
>
> ##Benchmark 3
>
> z.3 <- NULL
> system.time(z.3 <- ifelse(x > y, 1, -1))
user system elapsed
0.38 0.00 0.38
>
> ##Benchmark 4
>
> z.4 <- vector("numeric", length = 10^6)
> system.time(z.4 <- ifelse(x > y, 1, -1))
user system elapsed
0.31 0.00 0.31
>
> ##Benchmark 5
>
> system.time(z.5 <- 2*(x > y) - 1)
user system elapsed
0.01 0.00 0.01
>
> ##Benchmark 6
>
> system.time(z.6 <- numeric(length(x))-1)
user system elapsed
0 0 0
> system.time(z.6[x > y] <- 1)
user system elapsed
0.03 0.00 0.03
>
> ##Show that all results are identical
>
> identical(z.1, z.2)
[1] TRUE
> identical(z.1, z.3)
[1] TRUE
> identical(z.1, z.4)
[1] TRUE
> identical(z.1, z.5)
[1] TRUE
> identical(z.1, z.6)
[1] TRUE
I have not replicated these on other systems, but tentatively, it
appears that loops are much slower than ifelse() (3.77 s for the
preallocated loop versus roughly 0.3 to 0.4 s), which in turn is
slower than options 5 and 6 (0.01 to 0.03 s). However, when using the
same test data and the same system, I did not find an appreciable
difference in speed between options 5 and 6.
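
If anyone wants to repeat the vectorized part of the comparison on their own machine, something along these lines should work. It is only a sketch: the list name `methods`, the labels inside it, and the smaller n are my own illustrative choices, and the loop versions are left out so it finishes quickly.

```r
# Hypothetical harness: the names `methods` and `n` are illustrative only
set.seed(10)
n <- 1e5                          # smaller than the 10^6 used above
x <- rnorm(n)
set.seed(15)
y <- rnorm(n)

methods <- list(
  ifelse     = function(x, y) ifelse(x > y, 1, -1),
  arithmetic = function(x, y) 2 * (x > y) - 1,
  indexing   = function(x, y) {
    z <- numeric(length(x)) - 1   # start with all -1
    z[x > y] <- 1                 # overwrite where the condition holds
    z
  }
)

# Elapsed seconds for each method on the same data
timings <- sapply(methods, function(f) system.time(f(x, y))[["elapsed"]])
print(timings)

# Confirm every method returns the same vector as the first one
results <- lapply(methods, function(f) f(x, y))
all(sapply(results[-1], identical, results[[1]]))
```

Because every method sees the same seeded data within one session, the timings are at least comparable with each other, even if they differ from system to system.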
Cheers,
Josh
On Mon, Jul 12, 2010 at 7:09 AM, Raghu <r.raghuraman at gmail.com> wrote:
> When I just run a for loop it works. But if I am going to run a for loop
> every time for large vectors I might as well use C or any other language.
> The reason R is powerful is because it can handle large vectors without each
> element being manipulated? Please let me know where I am wrong.
>
> for(i in 1:length(news1o)){
> + if(news1o[i]>s2o[i])
> + s[i]<-1
> + else
> + s[i]<--1
> + }
>
> --
> 'Raghu'
--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/