[R] vectorized "leave one out" analyses

ripley at stats.ox.ac.uk
Mon Feb 3 20:42:02 CET 2003

There are jackknife functions about, but this is not jackknifing. Unless
popstat is itself vectorized (meaning, I think, that it can take a list of
datasets, or perhaps a matrix), I doubt if anything is better than (a).
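
A minimal sketch of approach (a), assuming popstat takes a data frame and
returns a single number. The popstat defined here is a hypothetical stand-in
(a column mean), not the statistic from the question, and leave_one_out is
an illustrative name:

```r
# Hypothetical stand-in for the poster's statistic: takes a data frame,
# returns one number. Replace with the real popstat.
popstat <- function(d) mean(d$x)

# Leave-one-out: recompute the statistic with row i dropped, for each i.
leave_one_out <- function(datfrm, stat = popstat) {
  vapply(seq_len(nrow(datfrm)),
         function(i) stat(datfrm[-i, , drop = FALSE]),  # drop row i only
         numeric(1))
}

dat <- data.frame(x = c(1, 2, 3, 4))
loo <- leave_one_out(dat)
```

vapply is still a loop at the R level, so this is no faster than an explicit
for loop; it only avoids the bookkeeping of a selection vector.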

Remember Jackson's rules of programming (which are quoted in `S 
Programming').  Don't optimize until you need to.
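
One way to find out whether optimization is needed is to time the plain loop
first. The sketch below uses the mean as a hypothetical statistic (not the
poster's popstat) because it happens to have a closed-form leave-one-out
version; most statistics will not, and the names loo_loop and loo_vec are
illustrative only:

```r
# Illustrative data; the statistic here is the mean, not the poster's popstat.
set.seed(1)
x <- rnorm(2000)

# Plain loop: recompute the statistic with observation i left out.
loo_loop <- function(x) {
  out <- numeric(length(x))
  for (i in seq_along(x)) out[i] <- mean(x[-i])
  out
}

# Closed form, available only because the mean is a simple sum:
# the mean without x[i] is (sum(x) - x[i]) / (n - 1).
loo_vec <- function(x) (sum(x) - x) / (length(x) - 1)

# Time the loop before deciding whether a rewrite is worth the effort.
print(system.time(res_loop <- loo_loop(x)))
print(system.time(res_vec <- loo_vec(x)))

# Both approaches should agree up to floating-point error.
stopifnot(isTRUE(all.equal(res_loop, res_vec)))
```

If the loop already runs in acceptable time on data of realistic size, there
is little reason to go further.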

On 3 Feb 2003, Allan Strand wrote:

> Hi all,
> I'm implementing a population genetic statistic that requires repeated
> re-estimation of population parameters after a single observation has
> been left out.  It seems to me that one could:
> a) use looping in R,
> b) use a vectorized approach in R,
> c) loop in a dynamically loaded c-function,
> d) or use an existing jackknife routine.
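> An untested skeleton of the code for 'a':
> foo <- function(datfrm)
> {
>   n <- nrow(datfrm)
>   retvec <- numeric(n)        # one statistic per left-out row
>   selvec <- rep(TRUE, n)      # rows currently included
>   for (i in seq_len(n))
>     {
>        selvec[i] <- FALSE     # leave out row i
>        # select rows (not columns) and keep the data frame structure
>        retvec[i] <- popstat(datfrm[selvec, , drop = FALSE])
>        selvec[i] <- TRUE      # restore row i
>     }
>   retvec
> }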
> I suppose that 'd' is the easiest option if such a routine exists, but
> I have not come across one by means of an archive search.  I'd like to
> avoid 'a' because of efficiency, and 'c' because of additional coding
> and linking steps.  I like the idea of 'b' because it would be nifty
> and likely fast, though there may be memory issues.  I'm sure that
> this is a general problem that somebody has solved in an elegant
> fashion.  I'm just looking for the solution. 

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
