[Rd] SUGGESTION: Settings to disable forked processing in R, e.g. parallel::mclapply()
Henrik Bengtsson
henr|k@bengt@@on @end|ng |rom gm@||@com
Thu Apr 11 22:06:47 CEST 2019
ISSUE:
Using *forks* for parallel processing in R is not always safe. The
`parallel::mclapply()` function uses forked processes to parallelize.
One confirmed case where forked processing causes problems is running
R via RStudio. It is recommended to use PSOCK clusters
(`parallel::makeCluster()`) rather than *forked* processes when
running R from RStudio
(https://github.com/rstudio/rstudio/issues/2597#issuecomment-482187011).
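For reference, here is a minimal sketch of the PSOCK alternative mentioned above. The worker count and the toy function are arbitrary choices for illustration; `makeCluster()` creates PSOCK workers by default, which are separate R sessions rather than forks of the current one.

```r
library(parallel)

# Start two PSOCK workers (the default cluster type); no forking is involved.
cl <- makeCluster(2L)

res <- tryCatch(
  # Apply a toy function across the workers.
  parLapply(cl, 1:4, function(i) i^2),
  # Always shut the workers down, even if the computation fails.
  finally = stopCluster(cl)
)
```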
AFAIK, it is not straightforward to disable forked processing in R.
One could set the environment variable `MC_CORES=1`, which sets the R
option `mc.cores=1` when the parallel package is loaded. Since
`mc.cores = getOption("mc.cores", 2L)` is the default for
`parallel::mclapply()`, this causes `mclapply()` to fall back to
`lapply()`, avoiding *forked* processing. However, this does not work
when the code specifies the `mc.cores` argument explicitly, e.g.
`mclapply(..., mc.cores = detectCores())`.
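To illustrate the limitation, here is a small sketch that sets the option directly instead of going through `MC_CORES`. Comparing worker PIDs against the current PID is just one way to detect forking, and the second part is guarded because `mc.cores > 1` is not supported on Windows.

```r
library(parallel)

# With mc.cores = 1, mclapply() falls back to lapply() in the current
# process, so every element reports the PID of this R session.
options(mc.cores = 1L)
pids_default <- unlist(mclapply(1:2, function(i) Sys.getpid()))
all(pids_default == Sys.getpid())

# An explicit mc.cores argument overrides the option, so forked child
# processes are used again despite the setting above.
if (.Platform$OS.type == "unix") {
  pids_explicit <- unlist(
    mclapply(1:2, function(i) Sys.getpid(), mc.cores = 2L)
  )
  any(pids_explicit != Sys.getpid())
}
```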
SUGGESTION:
Introduce environment variable `R_ENABLE_FORKS` and corresponding R
option `enable.forks` that both take logical scalars. By setting
`R_ENABLE_FORKS=false` or equivalently `enable.forks=FALSE`,
`parallel::mclapply()` will fall back to `lapply()`.
For `parallel::mcparallel()`, we could produce an error if forks are disabled.
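As a rough user-level sketch of the suggested behavior: neither `R_ENABLE_FORKS` nor `enable.forks` exists in R today, and the helper names `forks_enabled()`, `mclapply_guarded()` and `mcparallel_guarded()` are hypothetical. Letting the option take precedence over the environment variable is also an assumption; an actual implementation would live inside the parallel package.

```r
# Hypothetical: are forks enabled? Checks the proposed option first,
# then the proposed environment variable; defaults to TRUE.
forks_enabled <- function() {
  opt <- getOption("enable.forks", NULL)
  if (!is.null(opt)) return(isTRUE(as.logical(opt)))
  env <- as.logical(Sys.getenv("R_ENABLE_FORKS", "true"))
  # Unparsable values are treated as "enabled" in this sketch.
  if (is.na(env)) TRUE else env
}

# Hypothetical wrapper: forces mc.cores = 1 when forks are disabled,
# which makes mclapply() fall back to lapply().
mclapply_guarded <- function(X, FUN, ...,
                             mc.cores = getOption("mc.cores", 2L)) {
  if (!forks_enabled()) mc.cores <- 1L
  parallel::mclapply(X, FUN, ..., mc.cores = mc.cores)
}

# Hypothetical wrapper: refuses to fork when forks are disabled.
mcparallel_guarded <- function(expr, ...) {
  if (!forks_enabled())
    stop("forked processing is disabled (enable.forks = FALSE)")
  parallel::mcparallel(expr, ...)
}
```

With `options(enable.forks = FALSE)`, a call such as `mclapply_guarded(x, f, mc.cores = detectCores())` would then run sequentially in the current process, while `mcparallel_guarded()` would signal an error.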
Comments?
/Henrik