[R] Distributions for gbm models

Patrick Connolly p_connolly at slingshot.co.nz
Thu Dec 14 09:28:22 CET 2017

On page 409 of "Applied Predictive Modeling" by Max Kuhn, it states
that the gbm function can accommodate only two-class problems when
referring to the distribution parameter.

From the gbm help re: the distribution parameter:

          Currently available options are "gaussian" (squared error),
          "laplace" (absolute loss), "tdist" (t-distribution loss),
          "bernoulli" (logistic regression for 0-1 outcomes),
          "huberized" (huberized hinge loss for 0-1 outcomes),
          "multinomial" (classification when there are more than 2
          classes), "adaboost" (the AdaBoost exponential loss for 0-1
          outcomes), "poisson" (count outcomes), "coxph" (right
          censored observations), "quantile", or "pairwise" (ranking
          measure using the LambdaMart algorithm).
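
Going by that help text, a three-class fit made directly with gbm would
presumably look something like the sketch below. The iris data, the
number of trees and the object name fit_direct are only placeholders of
mine, and I am assuming the gbm package is installed; some gbm versions
may warn about the multinomial option.

```r
# Sketch only: assumes the gbm package is installed.
library(gbm)

set.seed(1)
# iris has a three-level factor response (Species), so this is a
# more-than-two-class problem.
fit_direct <- gbm(Species ~ ., data = iris,
                  distribution = "multinomial",  # option listed in the help text
                  n.trees = 50,                  # placeholder value
                  verbose = FALSE)

# Per-class probabilities for the first few rows.
head(predict(fit_direct, newdata = iris, n.trees = 50, type = "response"))
```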

I would have thought that huberized and multinomial would also be
possible.  Is that not so?  In any case, how would anything different
from bernoulli (the default) be specified when using the caret train
function since distribution appears not to be among the list of
parameters that caret recognises?

> getModelInfo("gbm")[["gbm"]]$parameters
          parameter   class                   label
1           n.trees numeric   # Boosting Iterations
2 interaction.depth numeric          Max Tree Depth
3         shrinkage numeric               Shrinkage
4    n.minobsinnode numeric Min. Terminal Node Size

Is that a limitation of the caret package?  Or is there something I'm
not getting?
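
For concreteness, here is a minimal sketch of the kind of call I have in
mind. It assumes (I have not confirmed this) that train forwards extra
named arguments through its ... to gbm, so that distribution need not be
one of the tuning parameters listed above. The iris data, the 3-fold
resampling and the object name fit_caret are only placeholders.

```r
# Sketch only: assumes the caret and gbm packages are installed, and that
# train() hands extra named arguments on to gbm() via `...`.
library(caret)

set.seed(1)
fit_caret <- train(Species ~ ., data = iris,
                   method = "gbm",
                   distribution = "multinomial",  # assumption: forwarded via `...`
                   trControl = trainControl(method = "cv", number = 3),
                   verbose = FALSE)               # likewise forwarded to gbm

# If the argument reached gbm, the fitted model should record it here.
fit_caret$finalModel$distribution$name

# The model's fit function shows how caret handles `distribution`.
getModelInfo("gbm", regex = FALSE)[["gbm"]]$fit
```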

   ___    Patrick Connolly   
 {~._.~}                   Great minds discuss ideas    
 _( Y )_  	         Average minds discuss events 
(:_~*~_:)                  Small minds discuss people  
 (_)-(_)  	                      ..... Eleanor Roosevelt
