[R] Problem with caret + foreach + M5 combination
Antoine Stevens
Antoine.Stevens at uclouvain.be
Wed Sep 21 11:53:15 CEST 2011
Hello,
I often use the caret package to develop regression models and compare
their performance. The foreach package is integrated in caret and can be
used to speed up the process through parallel computations.
Since caret version 5.01-001, one just needs to register the cores with one
of the "do" packages (doMC, etc.). This works fine in most situations, but
there is a problem when I use the M5 algorithm.
Everything works well with only one core, but the computation appears to
hang with 2 or more registered cores.
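As a sanity check that is independent of caret, the registration itself can be inspected with getDoParName() and getDoParWorkers() from the foreach package. The following is only a minimal sketch, assuming doMC is installed and the machine has at least two cores; it passes the core count directly to registerDoMC() instead of setting options(cores=...), and it does not call train() at all.

```r
library(foreach)
library(doMC)

# Register the multicore backend with an explicit number of workers
registerDoMC(cores = 2)

# Name of the registered backend and number of workers foreach will use
backend <- getDoParName()
workers <- getDoParWorkers()
print(backend)
print(workers)

# A trivial parallel loop to confirm the backend runs outside of caret
res <- foreach(i = 1:4, .combine = c) %dopar% sqrt(i)
print(res)
```

If this loop completes with two workers, the backend itself is functional and the hang is more likely specific to the model being fitted.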
I am using RStudio with R 2.13.1 on a Red Hat Linux 64-bit machine.
Here is an example:
library(doMC)
library(randomForest)
library(RWeka)
library(caret)
library(mlbench)
data(BostonHousing)
registerDoMC()
options(cores=1)
withoutMC <- train(medv ~ ., data = BostonHousing, "rf") # Works with random forest
options(cores=2)
usingMC <- train(medv ~ ., data = BostonHousing, "rf") # Works with random forest
options(cores=1)
withoutMC <- train(medv ~ ., data = BostonHousing, "M5") # Works with M5
options(cores=2)
usingMC <- train(medv ~ ., data = BostonHousing, "M5") # Does not work
So I tried with another parallel backend (doSNOW), but got another error.
library(doSNOW)
cl <- makeCluster(2, type = "SOCK")
clusterEvalQ(cl,library(caret))
clusterEvalQ(cl,library(RWeka))
[[1]]
 [1] "RWeka"     "caret"     "foreach"   "codetools" "iterators" "cluster"   "reshape"   "plyr"
 [9] "lattice"   "snow"      "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
[17] "base"
[[2]]
 [1] "RWeka"     "caret"     "foreach"   "codetools" "iterators" "cluster"   "reshape"   "plyr"
 [9] "lattice"   "snow"      "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
[17] "base"
registerDoSNOW(cl)
usingSNOW <- train(medv ~ ., data = BostonHousing, "M5")# Does not work
Error in { :
task 1 failed - "could not find function "predictionFunction""
Here, the workers cannot find "predictionFunction" (part of the caret
package, I believe), even though I loaded the package on each cluster node
with clusterEvalQ.
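One way to narrow this down is to ask each worker whether the function is visible on its search path, and separately whether it is defined inside the caret namespace. The following is only a sketch, assuming snow and caret are installed; the variable names on_path and in_namespace are made up for illustration, and the result will depend on the caret version.

```r
library(snow)

cl <- makeCluster(2, type = "SOCK")

# Load caret on each worker, as in the session above
invisible(clusterEvalQ(cl, library(caret)))

# Is predictionFunction visible on each worker's search path (i.e. exported)?
on_path <- clusterEvalQ(cl, exists("predictionFunction"))

# Is it defined inside the caret namespace at all?
in_namespace <- clusterEvalQ(
  cl,
  exists("predictionFunction", envir = asNamespace("caret"), inherits = FALSE)
)

print(on_path)
print(in_namespace)

stopCluster(cl)
```

If the first check is FALSE and the second is TRUE on the workers, the function would be internal to caret rather than exported, which might explain why loading the package with clusterEvalQ is not enough. This is only a guess at the cause, not a confirmed diagnosis.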
Any suggestions?
Here are my sessionInfo() and package versions:
sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: x86_64-redhat-linux-gnu (64-bit)
locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
other attached packages:
 [1] doSNOW_1.0.5       snow_0.3-7         caret_5.01-001     cluster_1.14.0     reshape_0.8.4
 [6] plyr_1.6           lattice_0.19-30    mlbench_2.1-0      RWeka_0.4-8        randomForest_4.6-2
[11] doMC_1.2.3         multicore_0.1-7    foreach_1.3.2      codetools_0.2-8    iterators_1.0.5
loaded via a namespace (and not attached):
[1] compiler_2.13.1   grid_2.13.1       rJava_0.9-1       RWekajars_3.7.4-1 tools_2.13.1
packageDescription("caret")
Package: caret
Version: 5.01-001
Date: 2011-09-01
Title: Classification and Regression Training
Author: Max Kuhn. Contributions from Jed Wing, Steve Weston, Andre Williams,
        Chris Keefer and Allan Engelhardt
Description: Misc functions for training and plotting classification and
        regression models
Maintainer: Max Kuhn <Max.Kuhn at pfizer.com>
Depends: R (>= 2.10), lattice, reshape, stats, plyr, cluster, foreach
URL: http://caret.r-forge.r-project.org/
Suggests: gbm, pls, mlbench, rpart, ellipse, ipred, klaR, randomForest,
        gpls, pamr, kernlab, mda, mgcv, nnet, class, MASS, mboost,
        earth (>= 2.2-3), party (>= 0.9-99992), ada, affy, proxy, e1071,
        grid, elasticnet, SDDA, caTools, RWeka (>= 0.4-1), superpc,
        penalized, sparseLDA (>= 0.1-1), spls, sda, glmnet, relaxo, lars,
        vbmp, nodeHarvest, rrcov, gam, stepPlr, GAMens (>= 1.1.1), rocc,
        foba, partDSA, hda, fastICA, neuralnet, quantregForest, rda,
        HDclassif, LogicReg, LogicForest, logicFS, RANN, qrnn, Boruta,
        Hmisc, Cubist, bst, leaps
License: GPL-2
Packaged: 2011-09-02 14:09:50 UTC; kuhna03
Repository: CRAN
Date/Publication: 2011-09-02 18:25:30
Built: R 2.13.1; x86_64-redhat-linux-gnu; 2011-09-16 18:51:38 UTC; unix
Thank you very much,
Antoine Stevens
Earth and Life Institute
UCLouvain
Belgium