[R] Timings of function execution in R [was Re: R in Industry]

Fri Feb 9 10:24:55 CET 2007

>>>>> "Ravi" == Ravi Varadhan <rvaradhan at jhmi.edu>
>>>>>     on Thu, 8 Feb 2007 18:41:38 -0500 writes:

    Ravi> Hi,
    Ravi> "greaterOf" is indeed an interesting function.  It is much faster than the
    Ravi> equivalent R function, "pmax", because pmax does a lot of checking for
    Ravi> missing data and for recycling.  Tom Lumley suggested a simple function to
    Ravi> replace pmax, without these checks, that is analogous to greaterOf, which I
    Ravi> call fast.pmax.  

    Ravi> fast.pmax <- function(x,y) {i<- x<y; x[i]<-y[i]; x}

    Ravi> Interestingly, greaterOf is even faster than fast.pmax, although you have to
    Ravi> be dealing with very large vectors (O(10^6)) to see any real difference.

Yes. Indeed, I have a file, whose first version dates from 1992,
where I explore the "slowness" of pmin() and pmax() (in S-plus
3.2 then). I have since added quite a few experiments and versions to that
file.

As a consequence, in the robustbase CRAN package (which is only a bit
more than a year old, though), there's a file, available as
  https://svn.r-project.org/R-packages/robustbase/R/Auxiliaries.R
with the very simple content {note line 3 !}:

-------------------------------------------------------------------------
### Fast versions of pmin() and pmax() for 2 arguments only:

### FIXME: should rather add these to R
pmin2 <- function(k,x) (x+k - abs(x-k))/2
pmax2 <- function(k,x) (x+k + abs(x-k))/2
-------------------------------------------------------------------------

{the "funny" argument name 'k' comes from the use of these to
 compute Huber's psi() fast :

  psiHuber <- function(x,k)  pmin2(k, pmax2(- k, x))
  curve(psiHuber(x, 1.35), -3,3, asp = 1)
}
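
A minimal sketch of how the abs()-based definitions above can be checked
against pmin() and pmax(). The grid 'x' and the use of all.equal() are
illustrative assumptions, not part of robustbase; all.equal() compares up
to a numerical tolerance, which is the appropriate test here because the
abs() formulas are not guaranteed to agree to the last bit.

```r
## Definitions as quoted from robustbase's Auxiliaries.R
pmin2 <- function(k, x) (x + k - abs(x - k)) / 2
pmax2 <- function(k, x) (x + k + abs(x - k)) / 2
psiHuber <- function(x, k) pmin2(k, pmax2(-k, x))

## Hypothetical test grid and tuning constant (k = 1.35 as in the curve() call)
x <- seq(-3, 3, by = 0.25)
k <- 1.35

## Agreement with the base functions, up to floating-point tolerance
ok.max <- all.equal(pmax2(k, x), pmax(k, x))
ok.min <- all.equal(pmin2(k, x), pmin(k, x))

## psiHuber() clamps x to the interval [-k, k]
ok.psi <- all.equal(psiHuber(x, k), pmin(k, pmax(-k, x)))

print(c(pmax2 = isTRUE(ok.max), pmin2 = isTRUE(ok.min), psiHuber = isTRUE(ok.psi)))
```

Replacing all.equal() by identical() in the checks would test for bit-level
agreement instead.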

One point *is* that I think proper function names would be pmin2() and
pmax2() since they work with exactly 2 arguments,
whereas IIRC the feature to work with '...' is exactly the
reason that pmax() and pmin() are so much slower.

I haven't checked whether Gabor's 
     pmax2.G <- function(x,y) {z <- x > y; z * (x-y) + y}
is even faster than the one using abs().
It may have the advantage of giving *identical* results (to the
last bit!)  to pmax()  which my version does not --- IIRC the
only reason I did not follow my own 'FIXME' above.
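
The comparison left open above could be run along the following lines.
This is only a sketch under stated assumptions: the test data (two random
normal vectors of length 10^6, without NAs), the number of repetitions and
the helper names 'ref', 'res' and 'timings' are all hypothetical, and no
particular outcome is claimed. Note that (x-y) + y is itself subject to
rounding, so bit-level agreement of pmax2.G() with pmax() is something to
check rather than to assume.

```r
set.seed(1)                       # hypothetical test data, no NAs
n <- 1e6
x <- rnorm(n)
y <- rnorm(n)

## The three candidates discussed in this thread
fast.pmax <- function(x, y) { i <- x < y; x[i] <- y[i]; x }
pmax2     <- function(k, x) (x + k + abs(x - k)) / 2
pmax2.G   <- function(x, y) { z <- x > y; z * (x - y) + y }

ref <- pmax(x, y)                 # reference result
res <- list(fast.pmax = fast.pmax(x, y),
            pmax2     = pmax2(x, y),
            pmax2.G   = pmax2.G(x, y))

## Bit-for-bit agreement with pmax(), per candidate
print(sapply(res, identical, ref))

## Largest absolute deviation from pmax(), per candidate
print(sapply(res, function(r) max(abs(r - ref))))

## Elapsed time for 10 calls each; values depend on machine and R version
funs <- list(pmax = pmax, fast.pmax = fast.pmax,
             pmax2 = pmax2, pmax2.G = pmax2.G)
timings <- sapply(funs, function(f)
    system.time(for (i in 1:10) f(x, y))[["elapsed"]])
print(timings)
```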

I had then planned to implement pmin2() and pmax2() in C code, trivially,
and hence get identical (to the last bit!) behavior as
pmin()/pmax(); but I now tend to think that the proper approach is to
code pmin() and pmax() via .Internal() and hence C code ...

[Not before DSC and my vacations though!!]

Martin Maechler, ETH Zurich