[R] (performance) time in Windows vs Linux
Prof Brian Ripley
ripley at stats.ox.ac.uk
Mon Jun 29 09:59:24 CEST 2009
On Mon, 29 Jun 2009, Raymond Wan wrote:
> milton ruser wrote:
>> In fact I have a quad-core. But how can I know if Linux is really
>> using only one core, and how can I set it up to use the 4 cores?
>
> I don't know the answer in the context of R -- I didn't know that R can use
> multiple cores by default?
It cannot, and much of this thread is pure speculation. So let's try
to set the record straight (as we have already done in the manuals).
The only way that a single R process will be using more than one CPU
is if you have added a multithreaded BLAS (and I've never heard of one
being used successfully with R for Windows) or other add-on such as
Luke Tierney's pnmath and pnmath0 packages. Packages such as snow and multicore
run multiple R processes.
I do run a multithreaded BLAS on my 8-core Linux box and do often see
'top' well over 100% -- I just tested and saw 798.9%.
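To make the 'top' figure concrete: a reading above 100% means the process consumed more CPU seconds than wall-clock seconds over the sampling interval, which a single thread cannot do. The following is only a minimal sketch of that arithmetic in Python, not anything R or top itself runs; the function names `cpu_percent` and `measure` are hypothetical. In R the analogous check is to compare the user and elapsed entries returned by system.time().

```python
import time


def cpu_percent(cpu_seconds, wall_seconds):
    """CPU use in the style of 'top': 100% means one fully busy core."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return 100.0 * cpu_seconds / wall_seconds


def measure(task):
    """Run task() and return (cpu_seconds, wall_seconds) for this process.

    time.process_time() sums user and system CPU time over all threads
    of the current process; time.perf_counter() is elapsed wall-clock time.
    """
    cpu0, wall0 = time.process_time(), time.perf_counter()
    task()
    return time.process_time() - cpu0, time.perf_counter() - wall0


if __name__ == "__main__":
    # A pure-Python loop runs on one thread, so this should stay near 100%.
    cpu, wall = measure(lambda: sum(i * i for i in range(2_000_000)))
    print(f"{cpu_percent(cpu, wall):.1f}% of one core")
```

On that reading, 798.9% corresponds to roughly 7.989 CPU seconds per elapsed second, i.e. close to all 8 cores busy.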
It is exceptional to see R under Windows running faster than a
well-tuned R under Linux on the same hardware (and my only Windows
machine is a multiboot that normally runs Linux, so I do have
extensive experience). There are a number of reasons:
- R for Windows always uses a shared library, whereas under Linux by
default it does not, for speed -- see the R-admin manual.
- MinGW until recently had only an older compiler, gcc 4.2.1 (gcc 4.4.0
for MinGW is just out, but I have not tried it). gcc 4.3.x has both
better general optimizations and better support for the Core 2 Duo my
machine has.
- You can tune the Linux version better by compiling yourself
(although some tuning is possible on Windows).
- Linux uses interrupts for things that Windows polls (or that, in some
instances, R itself polls on Windows). That includes the overhead on
Windows of running Rgui (if you are using that rather than Rterm) and
polling the Windows message system.
- 32-bit Linux allows access to more address space than 32-bit
Windows, so there may be less frequent garbage collections on large
tasks. In any case, the Linux memory manager is more efficient.
Against that, a 64-bit build will in general be slower than a 32-bit
one -- see the R-admin manual. If you run 32-bit R for Windows on
64-bit Windows you are running under the WOW64 subsystem and that has a
small overhead: but in our tests the REvolution 64-bit build of R was
slightly slower.
But we are only talking about small differences, say up to 20% and
usually more like 5-10%.
It is usually possible to find some task that a particular compiler
optimizes badly, so there will be rare exceptions.
> But in general, I use "htop", whose man pages
> describes it as: "This program is a free (GPL) ncurses-based process
> viewer."
>
> It is a colored version of "top", essentially. At the top of the screen, you
> will see your 4 cores represented as percentages. Under Setup, add
> "Processor" to the list of options and then "CPU" will appear as a column,
> which if you have 4 cores, the values will vary from 1 to 4.
>
> If you want to check if R is running on more than one core, then obviously R
> should appear more than once and with two different values under CPU.
Not so: that will happen if multiple copies of R are running, not if a
single copy of R is running multiple threads.
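One way to see the distinction on Linux is to ask the kernel how many threads a single process has, rather than counting rows in the process list. The sketch below is an illustration in Python, assuming a Linux /proc filesystem; the helper names `thread_count` and `threads_of` are hypothetical. It reads the "Threads:" field of /proc/<pid>/status.

```python
import os


def thread_count(status_text):
    """Return the 'Threads:' value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no Threads: field found")


def threads_of(pid):
    """Read the thread count of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        return thread_count(fh.read())


if __name__ == "__main__":
    # Inspects this process; substitute the PID of an R process shown by top.
    print(threads_of(os.getpid()))
```

A count above 1 only shows that the threads exist, not that they are busy; running 'top -H' lists the individual threads with their own CPU figures.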
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595