[R] No speed up using the parallel package and ncpus > 1 with boot() on linux machines
Jeff Newmiller
jdnewmil at dcn.davis.CA.us
Mon Oct 19 09:23:14 CEST 2015
Regarding cores... The only reliable way I have found so far is to look up the processor specs. In your case I found [1], which lists 4 cores and 8 threads, so the 8 "cores" you are seeing are logical (hyperthreaded) ones.
[1] http://ark.intel.com/m/products/64900/Intel-Core-i7-3615QM-Processor-6M-Cache-up-to-3_30-GHz#@product/specifications
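
As a rough software cross-check of the spec sheet, something like the following may help. This is only a sketch: count_physical_cores() is a hypothetical helper name, it assumes a Linux /proc/cpuinfo that carries "physical id" and "core id" fields (some virtual machines and non-x86 systems omit them), and detectCores(logical = FALSE) is not honoured on every platform or R version, so treat its answer with caution.

```r
library(parallel)

# Logical CPUs (hyperthreads included); this is the default count.
logical_cpus <- detectCores(logical = TRUE)

# Physical cores where the platform supports the distinction; on some
# platforms this may be NA or simply equal to the logical count.
physical_cores <- detectCores(logical = FALSE)

# Hypothetical helper: count distinct (physical id, core id) pairs in the
# lines of a Linux /proc/cpuinfo. Returns NA if the fields are missing.
count_physical_cores <- function(cpuinfo_lines) {
  field <- function(name) {
    hits <- grep(paste0("^", name), cpuinfo_lines, value = TRUE)
    trimws(sub("^[^:]*:", "", hits))
  }
  phys <- field("physical id")
  core <- field("core id")
  if (length(phys) == 0L || length(phys) != length(core)) {
    return(NA_integer_)
  }
  nrow(unique(data.frame(phys = phys, core = core)))
}

# Only read the file where it exists (that is, on Linux).
if (file.exists("/proc/cpuinfo")) {
  cpuinfo_count <- count_physical_cores(readLines("/proc/cpuinfo", warn = FALSE))
}
```

If the counts disagree with each other, the processor spec sheet remains the tie-breaker.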
---------------------------------------------------------------------------
Jeff Newmiller The ..... ..... Go Live...
DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
On October 18, 2015 2:31:13 AM PDT, Chris Evans <chrishold at psyctc.org> wrote:
>As with Milan's answer: perfect explanation and hugely appreciated. A
>few follow-up questions/comments below.
>
>----- Original Message -----
>> From: "Jeff Newmiller" <jdnewmil at dcn.davis.ca.us>
>> To: "Chris Evans" <chrishold at psyctc.org>
>> Cc: r-help at r-project.org
>> Sent: Saturday, 17 October, 2015 18:28:12
>> Subject: Re: [R] No speed up using the parallel package and ncpus > 1 with boot() on linux machines
>
>> None of this is surprising. If the calculations you divide your work up
>> into are small, then the overhead of communicating between parallel
>> processes will be a relatively large penalty to pay. You have to break
>> your problem up into larger chunks and depend on vector processing within
>> processes to keep the cpu busy doing useful work.
>
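
To illustrate the point about chunk size, here is a minimal sketch of splitting the replicates into one large, vectorised chunk per worker. It is not how boot() itself is implemented. The names boot_chunk() and chunked_boot() are hypothetical, the statistic (a resampled mean) is only a stand-in, and mclapply() forks, so this assumes a Unix-alike such as Linux.

```r
library(parallel)

# Hypothetical worker: compute n_reps bootstrap means of `values` in one
# vectorised step, so each worker returns only once.
boot_chunk <- function(n_reps, values) {
  n <- length(values)
  # Each column of idx holds the indices of one bootstrap resample.
  idx <- matrix(sample.int(n, n * n_reps, replace = TRUE), nrow = n)
  colMeans(matrix(values[idx], nrow = n))
}

# Hypothetical driver: one chunk per worker instead of one task per replicate.
chunked_boot <- function(values, total_reps, n_workers) {
  chunk_sizes <- rep(total_reps %/% n_workers, n_workers)
  extra <- total_reps %% n_workers
  # Spread any remainder over the first few chunks.
  chunk_sizes[seq_len(extra)] <- chunk_sizes[seq_len(extra)] + 1
  results <- mclapply(chunk_sizes, boot_chunk, values = values,
                      mc.cores = n_workers)
  unlist(results)
}

# Example call: 10000 replicates split across 4 workers.
# boot_means <- chunked_boot(rnorm(100), total_reps = 10000, n_workers = 4)
```

With this layout the communication cost is paid once per worker rather than once per replicate, which is the balance the paragraph above describes.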
>Aha. Got it!
>
>> Also, I am not aware of any model of Mac Mini that has 8 physical cores...
>> 4 is the max. Virtual cores offer a logical simplification of
>> multiprocessing but little actual performance improvement, because the
>> logical cores on each physical core share its execution units.
>
>Ah. Hadn't thought of that. It's a machine I rent and I thought it was a
>Mac mini. detectCores() reports 8 but perhaps they are virtual cores.
>/proc/cpuinfo says the processor is an Intel(R) Core(TM) i7-3615QM CPU
>@ 2.30GHz and shows 8 cores but again ... perhaps they are virtual.
>What's the best way to get a true core count?
>
>> Note that your problems are with long-running simulations... your examples
>> are too small to demonstrate the actual balance of processing vs.
>> communication overhead. Before you draw conclusions, try upping bootReps
>> by a few orders of magnitude, and run your test code a couple
>> of times to stabilize the memory conditions and obtain some consistency
>> in timings.
>
>OK. Good advice again, but what you are saying, and the findings I had
>there, are pretty consistent with what I was seeing with long-running
>things with bootReps up at 10k, and I think you've told me what I really
>want to know. I think the simplest way to parallelise may actually be
>fine for me: I'll run four (or maybe eight) separate R jobs (having a
>look at swapping to make sure I'm not pushing beyond physical RAM; I
>don't think these simulations will).
>
>> I have never used the parallel option in the boot package before... I have
>> always rolled my own to allow me to decide how much work to do within the
>> worker processes before returning from them. (The communication overhead
>> is particularly severe when using snow, but not necessarily something you
>> can neglect with multicore.)
>
>That sounds like an impressive and obviously pertinent approach. I
>think, as I say, I may be able to get away with a very simple approach
>that runs parallel simulations and then aggregates the data from each
>and analyses that.
>
>Many thanks Jeff. Brilliant help.
>
>Chris
>
>
>> On Sat, 17 Oct 2015, Chris Evans wrote:
>>
>>> I think I am failing to understand how boot() uses the parallel package on linux
>
>... rest of my original post deleted to save space ...