[R] forward stepwise selection
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Jun 7 11:37:40 CEST 2000
> Date: Wed, 07 Jun 2000 09:58:57 +0100
> From: "Simon Bond" <bond at graylab.ac.uk>
> Subject: [R] forward stepwise selection
>
> Dear R-Help,
>
> My problem/bug came to light when fitting a linear model using stepwise
> selection. I'd started with the straightforward command
>
> step(lm(y~., dataset))
>
> This worked fine, but because this starts with all the possible
> explanatory variables, it results in a model with too many explanatory
> variables. Hence I wanted to start with just a constant and do forward
> selection, to get a new starting model for full stepwise selection again.
> But R (version 0.99.0) doesn't like this.
Please use a current version of R, and in particular please use a
non-beta version of R. Your logic is not very sound (take another look
at your MSc notes on model selection), and I suggest a better approach
is to increase k in step or to use drop1(, test="F") repeatedly to
reduce the model. Remember AIC is attempting good prediction, not
good explanation.
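
As a minimal sketch of those two alternatives, assuming the hills data from MASS stands in for your dataset (the names full, fit_aic, fit_bic and f_tests are only illustrative, and k = log(n) is one common choice of a larger penalty, not the only one):

```r
library(MASS)   # provides the hills data (an assumed stand-in dataset)
data(hills)

# Start from the full model, as in step(lm(y ~ ., dataset)).
full <- lm(time ~ dist + climb, data = hills)
n <- nrow(hills)

# Default penalty k = 2 gives AIC.
fit_aic <- step(full, trace = 0)

# A larger k penalises each extra term more heavily, so fewer terms
# tend to survive; k = log(n) is the BIC-style penalty.
fit_bic <- step(full, k = log(n), trace = 0)

# One round of backward elimination by F test: inspect the table,
# drop the least significant term by hand with update(), and repeat.
f_tests <- drop1(full, test = "F")
print(f_tests)
```

With real data you would refit after each drop, for example with update(full, . ~ . - term), and call drop1() again until every remaining term is worth keeping.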
[...]
> step(lm(ANB.DIFF~1,tink4),scope=list(lower=~1,upper=fmla),direction="forward")
> Start: AIC= 25.35
> ANB.DIFF ~ 1
>
> Error in lm.fit(X, y) : incompatible dimensions
>
> I've narrowed it down to the command add1(), which uses lm.fit(), but the
> way add1() constructs X and y is undecipherable. Any advice would be much
> appreciated.
traceback() would have told you immediately where it came from,
and running debug(add1.lm) would enable you to track this down
further.
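
A rough sketch of that workflow, for an interactive session. It assumes a recent R, where the lm method of add1() is not exported and so is fetched with getS3method() (in the R of this thread plain add1.lm was visible); the names add1_lm and flagged are only illustrative:

```r
# Immediately after the error appears at the prompt, run
#   traceback()
# to print the chain of calls that led to lm.fit().

# Fetch the lm method of add1() and flag it for debugging.
add1_lm <- getS3method("add1", "lm")
debug(add1_lm)
flagged <- isdebugged(add1_lm)   # TRUE while the flag is set

# In an interactive session, rerun the failing step() call here.
# R drops into the browser inside the method: type n to execute the
# next line, and print objects such as x and y to compare dimensions.

# Clear the flag when finished.
undebug(add1_lm)
```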
I cannot reproduce this in 1.0.1:
> library(MASS)
> data(hills)
> step(lm(time ~ 1, hills), scope=list(lower=~1, upper=~dist+climb),
       direction="forward")
Start:  AIC= 274.88
 time ~ 1

        Df Sum of Sq   RSS AIC
+ dist   1     71997 13142 211
+ climb  1     55205 29934 240
<none>               85138 275

Step:  AIC= 211.49
 time ~ dist

        Df Sum of Sq     RSS   AIC
+ climb  1    6249.7  6891.9 190.9
<none>               13141.6 211.5

Step:  AIC= 190.9
 time ~ dist + climb

Call:
lm(formula = time ~ dist + climb, data = hills)

Coefficients:
(Intercept)         dist        climb
   -8.99204      6.21796      0.01105

If this still fails in 1.0.1 for you, please submit a bug report with a
reproducible example.
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595