[R] vector angle

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Jul 17 11:44:28 CEST 2001

On Tue, 17 Jul 2001, Laurent Gautier wrote:

> Prof Brian Ripley wrote:
> > On Tue, 17 Jul 2001, Laurent Gautier wrote:
> >
> > > Evan Zane Macosko wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > I'm translating into R some programs I worked through in Matlab to
> > > > calculate the angle between two vectors (very large--like 6200 rows in
> > > > each vector).  In Matlab, I used a series of nested for loops, because I
> > > > was calculating the angles between many pairs of vectors.  I know for
> > > > loops are not desirable in R code, so I was wondering if anyone could
> > > > recommend a faster way to complete this task.  Also, I have NAs in my
> > > > vectors--I've had trouble performing various operations on my vectors in R
> > > > because of these NAs.
> > > >
> > > > Any advice on this would be greatly appreciated.
> > >
> > > As far as I know, the use of apply (sapply and lapply) would make things run
> > > faster than 'for' loops.
> >
> > Not very much faster in R (and apply itself is basically a for loop).
> > Because for loops were slow in S3, the message seems to have got
> > transferred to S4 and R.  Often the best approach is to see if a loop is
> > fast enough, first.  (In S-PLUS 5.0 lapply was actually slower than a for
> > loop.)
> >
> And I lived all these years in ignorance, thinking I was doing good while I was
> making things worse.....
> I haven't looked at the R introduction manuals for a while now, so maybe the
> following remark is already stated in them. But by the time I started with R,
> Modern Applied Statistics with S-PLUS was the main reference, and being a poooor
> student at that time, I learned things through the internet. That proved to be a
> bad thing, since at that time there was a rumour about these functions being
> faster than loops (just as the 'map' function is said to be faster than a for
> loop in Python, for example).
> I just looked at what I would get as an answer using a webcrawler, and the rumour
> seems to be still alive
> (see http://www.math.yorku.ca/Who/Faculty/Monette/S-news/2531.html , at the
> bottom of the page, or
> http://www.usc.edu/isd/doc/statistics/splus/faq/v5/newinv5.shtml ), but had I
> followed more closely what has been said on this list, I would have known it was
> not the case
> (see http://www.ens.gu.edu.au/robertk/R/help/00a/1999.html).
> Thanks for pointing out the mistake I made,

Not a mistake, but a misconception perhaps.

Bill Venables and I deliberately illustrated a number of approaches in
Chapter 7 of S Programming using S+3.4, S+2000, S+5.1 and R 0.90.1 to show
how much the ranking of approaches depends on the system.  As we say there
(p. 152):

  Since our aim is not to compare systems, the timings here using
  different engines were done on different systems, all of which had
  ample RAM.  It is worth noting that the different S implementations
  used here do differ, sometimes radically, in their ordering of
  approaches, and the ordering might be different again on machines with
  less RAM available.

One example (using a for loop) shows a 17x speed-up in S+5.1, and a 50%
speed-up in R.

Also, later versions of R show some differences, especially due to the
new memory management system (and lapply has been altered since those
timings too).

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
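
The advice in this thread, to check whether a loop is fast enough before
rewriting it, can be tried directly on the original question. What follows
is only a sketch, not code from the thread: the function name vec_angle, the
simulated 6200 x 20 matrix, and the choice to drop positions where either
vector is NA are all assumptions made for illustration.

```r
# Angle (in radians) between two numeric vectors, using only the
# positions where both vectors are non-missing (an assumed NA policy).
vec_angle <- function(x, y) {
  ok <- !is.na(x) & !is.na(y)            # pairwise-complete positions
  x <- x[ok]
  y <- y[ok]
  cosine <- sum(x * y) / sqrt(sum(x^2) * sum(y^2))
  acos(min(max(cosine, -1), 1))          # clamp against rounding error
}

# Hypothetical data: 20 vectors of 6200 rows each, with some NAs.
set.seed(1)
m <- matrix(rnorm(6200 * 20), nrow = 6200)
m[sample(length(m), 500)] <- NA

pairs <- combn(ncol(m), 2)               # every pair of columns, one per column

# Approach 1: explicit for loop with a preallocated result vector.
loop_time <- system.time({
  ang_loop <- numeric(ncol(pairs))
  for (k in seq_len(ncol(pairs))) {
    ang_loop[k] <- vec_angle(m[, pairs[1, k]], m[, pairs[2, k]])
  }
})

# Approach 2: sapply over the same pairs.
sapply_time <- system.time({
  ang_sapply <- sapply(seq_len(ncol(pairs)), function(k) {
    vec_angle(m[, pairs[1, k]], m[, pairs[2, k]])
  })
})

# Both approaches should give the same angles; only the timings may differ.
stopifnot(isTRUE(all.equal(ang_loop, ang_sapply)))
print(loop_time)
print(sapply_time)
```

Timings depend on the machine and the R version, so no expected figures are
given here; the point is only that the comparison is cheap to run before
deciding whether the loop needs replacing.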
