[R] Parallelizing GBM

Lorenzo Isella lorenzo.isella at gmail.com
Sun Mar 24 12:31:56 CET 2013


Dear All,
I am far from being a guru in parallel programming.
Most of the time, I rely on randomForest for data mining large datasets.
I would also like to try the gradient boosted methods in GBM,
but I need parallelization.
I normally rely on gbm.fit for speed reasons, and I usually call it this
way:



gbm_model <- gbm.fit(trainRF, prices_train,
                     offset = NULL,
                     misc = NULL,
                     distribution = "multinomial",
                     w = NULL,
                     var.monotone = NULL,
                     n.trees = 50,
                     interaction.depth = 5,
                     n.minobsinnode = 10,
                     shrinkage = 0.001,
                     bag.fraction = 0.5,
                     nTrain = (n_train/2),
                     keep.data = FALSE,
                     verbose = TRUE,
                     var.names = NULL,
                     response.name = NULL)


Does anybody know an easy way to parallelize the model fitting (in this
case, that simply means having 4 cores on the same machine work on the problem)?
Any suggestion is welcome.
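
To make the question concrete, here is a rough sketch of the only kind of
parallelism I can picture so far. It does not split a single gbm.fit call
across cores (boosting builds its trees one after another); instead it runs
several independent fits, one per shrinkage value, on 4 worker processes
using the parallel package. The synthetic data, the shrinkage grid, the
"gaussian" distribution and the helper fit_one are all assumptions made up
for illustration; they stand in for trainRF, prices_train and my real
settings.

```r
library(parallel)

# Synthetic stand-in data (hypothetical); replace with trainRF / prices_train.
set.seed(1)
n <- 400
x <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n))
y <- x$a + 0.5 * x$b^2 + rnorm(n, sd = 0.1)

# Hypothetical grid: one independent fit per shrinkage value.
shrinkage_grid <- c(0.001, 0.01, 0.05, 0.1)

# Hypothetical helper: fit one model and report its best validation error.
# The first half of the rows is used for training (nTrain), the rest for
# validation, so fit$valid.error is populated.
fit_one <- function(s, x, y) {
  fit <- gbm::gbm.fit(x, y,
                      distribution = "gaussian",
                      n.trees = 50,
                      interaction.depth = 5,
                      n.minobsinnode = 10,
                      shrinkage = s,
                      bag.fraction = 0.5,
                      nTrain = floor(nrow(x) / 2),
                      keep.data = FALSE,
                      verbose = FALSE)
  list(shrinkage = s, valid_error = min(fit$valid.error))
}

cl <- makeCluster(4)            # 4 worker processes on the same machine
clusterSetRNGStream(cl, 123)    # reproducible random streams on the workers
results <- parLapply(cl, shrinkage_grid, fit_one, x = x, y = y)
stopCluster(cl)

# One row per shrinkage value with its best validation error.
summary_df <- do.call(rbind, lapply(results, as.data.frame))
print(summary_df)
```

This keeps all 4 cores busy, but only when there are several models to fit;
it does nothing to speed up one individual model, which is what I am really
after.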
Cheers

Lorenzo


