[R] Automatic creation of binary logistic models
Marc Schwartz
marc_schwartz at me.com
Thu Aug 4 22:35:33 CEST 2011
On Aug 4, 2011, at 2:23 PM, Paul Smith wrote:
> Dear All,
>
> Suppose that you are trying to create a binary logistic model by
> trying different combinations of predictors. Has R got an automatic
> way of doing this, i.e., is there some way of automatically generating
> different tentative models and checking their corresponding AIC value?
> If so, could you please direct me to an example?
>
> Thanks in advance,
>
> Paul
Hi Paul,
If it were not for JSM going on at the moment, you would likely get a reply from Frank Harrell telling you why this approach is not a good idea. It is tantamount to stepwise selection, with variables going in and out of the model based upon either AIC or perhaps Wald p-values.
If you search the R list archives using rseek.org with keywords such as "stepwise regression Harrell", you will see a plethora of discussions on this over the years.
You might want to obtain a copy of Frank's book, Regression Modeling Strategies, along with Ewout Steyerberg's book, Clinical Prediction Models, both of which cover this topic and offer alternative approaches to model development. These generally include pre-specifying full models, considering how many covariate degrees of freedom you can reasonably include in the model, and applying shrinkage/penalization.
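To make the degrees-of-freedom point a little more concrete, here is a rough sketch in base R. It assumes a commonly quoted rule of thumb of roughly 10 to 20 events per candidate parameter, with the smaller of the event and non-event counts as the limiting sample size. The rule, the simulated outcome, and the names m and budget are illustrative assumptions, not something stated above, and this is no substitute for the fuller treatment in the books.

```r
# Rough covariate "degrees of freedom budget" for a binary outcome.
# Assumption: about 10 to 20 events per candidate parameter, using the
# smaller of the event and non-event counts as the limiting sample size.
# The outcome below is simulated purely for illustration.
set.seed(1)
y <- rbinom(400, 1, 0.2)

m <- min(sum(y == 1), sum(y == 0))    # limiting sample size
budget <- floor(m / c(10, 15, 20))    # parameters affordable under each divisor
names(budget) <- c("m/10", "m/15", "m/20")
print(budget)
```

The budget counts parameters, not variables: a nonlinear term or a multi-level factor uses up more than one degree of freedom.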
If you need to engage in data reduction, you might want to consider using the LASSO, as implemented in the glmnet package on CRAN. More information on this method is available at: http://www-stat.stanford.edu/~tibs/lasso.html. An alternative might be backward elimination, which Frank covers in:
http://biostat.mc.vanderbilt.edu/wiki/pub/Main/RmS/rms.pdf
which is a supplement to his course.
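As a minimal sketch of the LASSO route mentioned above, the following fits a penalized binary logistic model with glmnet and picks the penalty by cross-validation. The data are simulated, and the predictor names x1 to x10 and the true coefficients are made up for illustration. It assumes the glmnet package is installed.

```r
# Minimal LASSO sketch for a binary outcome using glmnet.
# Simulated data: only x1 and x2 truly affect the outcome.
library(glmnet)

set.seed(42)
n <- 500
p <- 10
x <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
eta <- 1.0 * x[, 1] - 0.8 * x[, 2]
y <- rbinom(n, 1, plogis(eta))

# alpha = 1 gives the LASSO penalty; lambda is chosen by 10-fold cross-validation
cvfit <- cv.glmnet(x, y, family = "binomial", alpha = 1, nfolds = 10)

# Coefficients at the more heavily penalized "lambda.1se" choice;
# predictors shrunk exactly to zero are effectively dropped from the model
b <- coef(cvfit, s = "lambda.1se")
print(b)
```

Setting alpha = 0 instead gives a ridge penalty, which shrinks the coefficients of a pre-specified full model without dropping any predictors.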
Automated creation of models ignores the expertise of both the statistician and subject matter experts, to the detriment of inference.
Regards,
Marc Schwartz