[R] Why is mclapply slower than apply in this case?

Bert Gunter gunter.berton at gene.com
Thu Aug 8 18:52:56 CEST 2013


Tomas:

Do some reading on parallelization.

Parallelizing code incurs the overhead of setting up, keeping track
of, and synchronizing the separate workers (mclapply forks worker
processes rather than starting threads). Whether the speedup is worth
that overhead depends on the problem, its size, the algorithms, the
machines/hardware, ...
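
As a rough illustration of that trade-off, here is a minimal sketch that times lapply against mclapply on a very cheap task and on a more expensive one. The functions cheap, costly and time_both are hypothetical stand-ins, not taken from the original post, and two available cores on a Unix-alike are assumed. mclapply relies on forking, which is unavailable on Windows, so the sketch falls back to one core there.

```r
library(parallel)

# Assumption: at least 2 cores on a Unix-alike; forking is not
# available on Windows, so use a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Hypothetical workloads, not from the original post.
cheap  <- function(i) i^2                                # very little work per call
costly <- function(i) sum(sort(sin(seq_len(2e5) * i)))   # noticeably more work per call

# Elapsed time of the serial and the forked version of the same loop.
time_both <- function(f, x) {
  c(lapply   = system.time(lapply(x, f))[["elapsed"]],
    mclapply = system.time(mclapply(x, f, mc.cores = cores))[["elapsed"]])
}

overhead_demo <- rbind(cheap  = time_both(cheap, 1:8),
                       costly = time_both(costly, 1:8))
print(overhead_demo)
```

On most machines the cheap row should show no gain (or a loss) from mclapply, because fork and result-collection costs dominate, while the costly row has a better chance of benefiting. The exact numbers depend on the hardware.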

Cheers,
Bert

On Thu, Aug 8, 2013 at 4:00 AM, Tomas Reigl <incivile at seznam.cz> wrote:
>
>
> Hello,
>
>
> I'm pretty confused. I want to speed up my algorithm by using mclapply
> (from the parallel package), but when I compare timings, apply still wins.
>
> I'm smoothing log2ratio data with rq.fit.fnb (from quantreg), which is
> called by my function quantsm, and I'm wrapping my data into a matrix/list
> for apply/lapply (mclapply) usage.
>
>
>
>
> I adjust my data like this:
>
> q = matrix(data, ncol=N)        # wrapping into matrix (using N = 2, 4, 6 or 8)
> ql = as.list(as.data.frame(q))  # making list
>
> And the timing comparison:
>
> apply=system.time(apply(q, 1, FUN=quantsm, 0.50, 2))
> lapply=system.time(lapply(ql, FUN=quantsm, 0.50, 2))
> mc2lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=2))
> mc4lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=4))
> mc6lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=6))
> mc8lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=8))
> timing=rbind(apply,lapply,mc2lapply,mc4lapply,mc6lapply,mc8lapply)
>
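
One thing that may be worth checking in the timing code above: apply(q, 1, ...) iterates over the rows of q, while ql is a list of its columns, so the two calls may not be doing the same work. A minimal sketch with stand-in data (rnorm values and N = 4 are assumptions, not the poster's data):

```r
# Stand-in for the poster's data: any numeric vector of length 2000.
data <- rnorm(2000)
N <- 4

q  <- matrix(data, ncol = N)        # 500 rows, 4 columns
ql <- as.list(as.data.frame(q))     # list of 4 column vectors

# Length of the vector that each approach hands to FUN
row_lengths <- unique(apply(q, 1, length))  # apply(q, 1, ...) walks rows
col_lengths <- unique(lengths(ql))          # lapply(ql, ...) walks columns

row_lengths  # 4
col_lengths  # 500

# A like-for-like comparison iterates over columns in both cases
same_work <- isTRUE(all.equal(unname(apply(q, 2, sum)),
                              unname(sapply(ql, sum))))
```

With MARGIN = 1, quantsm would be called on 500 vectors of length 4, whereas the lapply/mclapply versions call it on 4 vectors of length 500. Since quantsm builds dense matrices whose size grows with the square of the input length, that difference alone could account for much of the gap in the timings.
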
> Function quantsm:
>
> quantsm <- function (y, p = 0.5, lambda) {
>    # Quantile smoothing
>    # Input: response y, quantile level p (0<p<1), smoothing parameter lambda
>    # Result: quantile curve
>
>    # Augment the data for the difference penalty
>    m <- length(y)
>    E <- diag(m);
>    Dmat <- diff(E);
>    X <- rbind(E, lambda * Dmat)
>    u <- c(y, rep(0, m - 1))
>
>    # Call quantile regression
>    q <- rq.fit.fnb(X, u, tau = p)
>    q
> }
>
> Function rq.fit.fnb (quantreg library):
>
> rq.fit.fnb <- function (x, y, tau = 0.5, beta = 0.99995, eps = 1e-06)
> {
>     n <- length(y)
>     p <- ncol(x)
>     if (n != nrow(x))
>         stop("x and y don't match n")
>     if (tau < eps || tau > 1 - eps)
>         stop("No parametric Frisch-Newton method.  Set tau in (0,1)")
>     rhs <- (1 - tau) * apply(x, 2, sum)
>     d <- rep(1, n)
>     u <- rep(1, n)
>     wn <- rep(0, 10 * n)
>     wn[1:n] <- (1 - tau)
>     z <- .Fortran("rqfnb", as.integer(n), as.integer(p), a = as.double(t(as.matrix(x))),
>         c = as.double(-y), rhs = as.double(rhs), d = as.double(d),
>         as.double(u), beta = as.double(beta), eps = as.double(eps),
>         wn = as.double(wn), wp = double((p + 3) * p), it.count = integer(3),
>         info = integer(1), PACKAGE = "quantreg")
>     coefficients <- -z$wp[1:p]
>     names(coefficients) <- dimnames(x)[[2]]
>     residuals <- y - x %*% coefficients
>     list(coefficients = coefficients, tau = tau, residuals = residuals)
> }
>
> For a data vector of length 2000 I get:
>
> (values = elapsed time in seconds; columns = number of columns in the
> smoothed matrix/list)
>
>            2cols 4cols 6cols 8cols
> apply      0.178 0.096 0.069 0.056
> lapply    16.555 4.299 1.785 0.972
> mc2lapply 11.192 2.089 0.927 0.545
> mc4lapply 10.649 1.326 0.694 0.396
> mc6lapply 11.271 1.384 0.528 0.320
> mc8lapply 10.133 1.390 0.560 0.260
>
> For data of length 4000 I get:
>
>             2cols  4cols  6cols 8cols
> apply       0.351  0.187  0.137 0.110
> lapply    189.339 32.654 14.544 8.674
> mc2lapply 186.047 20.791  7.261 4.231
> mc4lapply 185.382 30.286  5.767 2.397
> mc6lapply 184.048 30.170  8.059 2.865
> mc8lapply 182.611 37.617  7.408 2.842
>
> Why is apply so much more efficient than mclapply? Maybe I'm just making a
> common beginner mistake.
>
> Thank you for your reactions.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


