[R] dist like function but where you can configure the method
William Dunlap
wdunlap at tibco.com
Fri May 16 23:09:47 CEST 2014
>> system.time(apply(t(1:(n*n)),1,myfunc))
> user system elapsed
> 0.19 0.00 0.19
That calls 'myfunc' exactly once:
> system.time(apply(t(1:(3*3)), 1, print))
[1] 1 2 3 4 5 6 7 8 9
user system elapsed
0 0 0
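
To make the call count explicit, here is a small sketch. The wrapper counting_func and the counter calls are hypothetical names introduced only for this illustration; the body is the same x - y as the posted myfunc. Because t(1:(n*n)) is a matrix with a single row, apply() over rows hands the whole row to the function in one call, whereas sapply() over the same values calls it once per element.

```r
calls <- 0
counting_func <- function(x, y = x) {  # hypothetical wrapper: myfunc plus a call counter
  calls <<- calls + 1
  x - y
}

n <- 3
m <- t(1:(n * n))   # a 1 x 9 matrix: one row, nine columns
dim(m)

invisible(apply(m, 1, counting_func))   # iterates over the single row
calls_apply <- calls

calls <- 0
invisible(sapply(1:(n * n), counting_func))  # iterates over the nine elements
calls_sapply <- calls

c(apply = calls_apply, sapply = calls_sapply)
```

So the apply() timing measures one vectorised subtraction (one call), not n*n function calls (sapply makes nine here), and is not comparable to f2 or f3.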
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Fri, May 16, 2014 at 1:00 PM, Witold E Wolski <wewolski at gmail.com> wrote:
> Ouch,
>
> First: my question was not how to implement dist, but whether there is a
> more generic dist function than stats::dist.
>
> Secondly: ks.test is meant as a placeholder (see the comment in the
> code I sent) for any other function taking two vector arguments.
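
For what it is worth, a dist-like function that accepts any function of two vectors can be sketched in a few lines of base R. The name pairwise_dist and its arguments are hypothetical (this is not an existing function in stats), and it assumes that x has at least two columns and that fun returns a single number.

```r
# Hypothetical helper, not part of base R: pairwise "distance" between the
# columns of x, computed by any function 'fun' of two vectors.
pairwise_dist <- function(x, fun, ...) {
  x <- as.matrix(x)
  p <- ncol(x)
  idx <- utils::combn(p, 2)   # all column pairs i < j, one pair per column
  d <- vapply(seq_len(ncol(idx)), function(k) {
    fun(x[, idx[1, k]], x[, idx[2, k]], ...)
  }, numeric(1))
  m <- matrix(0, p, p, dimnames = list(colnames(x), colnames(x)))
  m[lower.tri(m)] <- d        # combn order matches column-major lower triangle
  stats::as.dist(m)
}

set.seed(1)
x <- matrix(rnorm(40), nrow = 10, ncol = 4)

# ks.test statistic as the "distance" (the placeholder from the post)
ks_stat <- function(a, b) unname(ks.test(a, b)$statistic)
d_ks <- pairwise_dist(x, ks_stat)

# Sanity check against stats::dist with the Euclidean metric
d_euc <- pairwise_dist(x, function(a, b) sqrt(sum((a - b)^2)))
all.equal(as.vector(d_euc), as.vector(dist(t(x))))
```

The function is evaluated only for i < j, so it is assumed to be symmetric, as a distance would be.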
>
> Third: I do subscribe to the idea that a function call is easier to
> read and understand than a for loop. @Bert: apply is a native C
> function and the loop is not interpreted, AFAIK.
>
> @Rui @Barry @Jari: What do you benchmark? An empty loop?
>
> Look at the trivial benchmarks below: _apply_ clearly outperforms a
> for loop in R. It always has; it even outperforms an empty for loop.
>
> # an empty, unrealistic for loop, as suggested by Rui, Barry and Jari
> f1 <- function(n){
>   for(i in 1:n){
>     for(j in 1:n){
>     }
>   }
> }
>
>
> myfunc = function(x,y=x){x-y}
>
> # a for loop which actually does something
> f2 <- function(n){
>   mm <- matrix(0,ncol=n,nrow=n)
>   for(i in 1:n){
>     for(j in 1:n){
>       mm[i,j] = myfunc(i,j)
>     }
>   }
>   return(mm)
> }
>
> # an array
> f3 = function(n){
>   res = rep(0,n*n)
>   for(i in 1:(n*n))
>   {
>     res[i] = myfunc(i)
>   }
>   return(res)
> }
>
>
> n = 1000
> system.time(f1(n))
> system.time(f2(n))
> system.time(f3(n))
> system.time(apply(t(1:(n*n)),1,myfunc))
>
>
>> system.time(f1(n))
> user system elapsed
> 0.28 0.00 0.28
>> system.time(f2(n))
> user system elapsed
> 6.80 0.00 7.09
>> system.time(f3(n))
> user system elapsed
> 5.83 0.00 5.98
>> system.time(apply(t(1:(n*n)),1,myfunc))
> user system elapsed
> 0.19 0.00 0.19
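
A like-for-like comparison with f2 needs both versions to produce the same n x n matrix. The sketch below is one way to set that up; loop_version and outer_version are hypothetical names, and n is reduced to 200 only to keep the run short. Note that outer() makes a single vectorised call to myfunc, which works here only because x - y is itself vectorised; a function such as ks.test would not qualify.

```r
myfunc <- function(x, y = x) x - y   # same placeholder as in the post

n <- 200   # assumption: smaller than the posted n = 1000, for a quick run

# Explicit double loop: n * n separate calls to myfunc
loop_version <- function(n) {
  mm <- matrix(0, nrow = n, ncol = n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      mm[i, j] <- myfunc(i, j)
    }
  }
  mm
}

# A single vectorised call to myfunc via outer()
outer_version <- function(n) outer(seq_len(n), seq_len(n), myfunc)

# Same values (all.equal ignores the integer/double storage difference)
same_result <- isTRUE(all.equal(loop_version(n), outer_version(n)))
same_result

system.time(loop_version(n))
system.time(outer_version(n))
```

Any speed difference seen here comes from calling myfunc once instead of n * n times, not from the loop construct as such.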
>
> On 16 May 2014 20:55, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>> Hello,
>>
>> The compiler package is good at speeding up for loops, but in this case the
>> gain is negligible. The ks test is what actually takes the time.
>>
>> library(compiler)
>>
>> f1 <- function(){
>>   for(i in 1:100){
>>     for(j in 1:100){
>>       ks.test(runif(100),runif(100))
>>     }
>>   }
>> }
>>
>> f1.c <- cmpfun(f1)
>>
>> system.time(f1())
>> user system elapsed
>> 3.50 0.00 3.53
>> system.time(f1.c())
>> user system elapsed
>> 3.47 0.00 3.48
>>
>>
>> Rui Barradas
>>
>> Em 16-05-2014 17:12, Barry Rowlingson escreveu:
>>>
>>> On Fri, May 16, 2014 at 4:46 PM, Witold E Wolski <wewolski at gmail.com>
>>> wrote:
>>>>
>>>> Dear Jari,
>>>>
>>>> Thanks for your reply...
>>>>
>>>> The overhead would be
>>>> 2 for loops
>>>> for(i in 1:dim(x)[2])
>>>> for(j in i:dim(x)[2])
>>>>
>>>> isn't it? Or are you seeing a different way to implement it?
>>>>
>>>> A for loop is pretty expensive in R. Therefore I am looking for an
>>>> implementation similar to apply or lapply, where the iteration is done
>>>> in native code.
>>>
>>>
>>> No, a for loop is not pretty expensive in R -- at least not compared
>>> to doing a k-s test:
>>>
>>> > system.time(for(i in 1:10000){ks.test(runif(100),runif(100))})
>>> user system elapsed
>>> 3.680 0.012 3.697
>>>
>>> 3.68 seconds to do 10000 ks tests (and generate 200 runifs for each)
>>>
>>> > system.time(for(i in 1:10000){})
>>> user system elapsed
>>> 0.000 0.000 0.001
>>>
>>> 0.000s to do 10000 empty loops. Oh, let's nest it for fun:
>>>
>>> > system.time(for(i in 1:100){for(i in 1:100){ks.test(runif(100),runif(100))}})
>>> user system elapsed
>>> 3.692 0.004 3.701
>>>
>>> No different. Even a ks-test with only 5 items is taking me 2.2 seconds.
>>>
>>> Moral: don't worry about the for loops.
>>>
>>> Barry
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>
>
>
> --
> Witold Eryk Wolski
>