[R] trouble with step() and stepAIC() selecting the best model
natalia norden
norden at cict.fr
Fri May 5 11:14:18 CEST 2006
Hello,
I have some trouble using step() and stepAIC() functions.
I'm predicting recruitment against several factors for different plant
species using a negative binomial glm.
Sometimes, summary(step(model)) or summary(stepAIC(model) does not
select the best model (lowest AIC) but just stops before.
For some species, step() works and stepAIC don't and in others, it's the
opposite.
The model is
model=glm.nb(sdg.cens1 ~ seed.cens1 + log(in.lit+1) +
log(trans.dir+1)+ log(trans.dif+1) + slope +log(pH+1) + log(CN+1) + site,
data=data)
For example, for species A using stepAIC() I get:
Call:
glm.nb(formula = tot.rec ~ slope + log(pH + 1) + log(CN + 1),
data = data, init.theta = 0.0575099635396228, link = log)
Deviance Residuals:
Min 1Q Median 3Q Max
-7.591e-01 -3.688e-01 -1.828e-01 -8.494e-08 1.520e+00
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 110.729 47.643 2.324 0.02012 *
slope2 -34.202 972079.452 -3.52e-05 0.99997
slope3 -1.423 1.371 -1.038 0.29928
log(pH + 1) -51.244 19.544 -2.622 0.00874 **
log(CN + 1) -9.132 6.906 -1.322 0.18602
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Negative Binomial(0.0575) family taken to be 1)
Null deviance: 34.068 on 156 degrees of freedom
Residual deviance: 19.602 on 152 degrees of freedom
AIC: 90.357
But using step() I get:
Step: AIC= 138.4
tot.rec ~ log(mean.lit.rec + 1) + log(trans.dir + 1) + log(trans.dif +
1) + slope + log(pH + 1) + log(CN + 1) + site
Call:
glm.nb(formula = tot.rec ~ log(mean.lit.rec + 1) + log(trans.dir +
1) + log(trans.dif + 1) + slope + log(pH + 1) + log(CN +
1) + site, data = data, init.theta = 74070.0501196965, link = log)
Deviance Residuals:
Min 1Q Median 3Q Max
-2.3110 -0.5450 -0.3179 0.0000 3.8755
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 6.809e+01 1.905e+01 3.574 0.000351 ***
log(mean.lit.rec + 1) -1.238e+00 1.083e+00 -1.143 0.252987
log(trans.dir + 1) 2.069e-01 1.227e+00 0.169 0.866120
log(trans.dif + 1) -6.978e-01 1.578e+00 -0.442 0.658327
slope2 -3.730e+01 1.022e+07 -3.65e-06 0.999997
slope3 -9.195e-01 6.269e-01 -1.467 0.142488
log(pH + 1) -3.152e+01 6.918e+00 -4.556 5.22e-06 ***
log(CN + 1) -5.053e+00 3.404e+00 -1.485 0.137643
site1 1.379e-01 4.618e-01 0.299 0.765278
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Negative Binomial(91071.54) family taken to be 1)
Null deviance: 145.586 on 156 degrees of freedom
Residual deviance: 99.166 on 148 degrees of freedom
AIC: 140.40
However, for species B I get the opposite, it works with step() but not
with stepAIC().
Using step() AIC =414.28 and using stepAIC(), AIC=424.15 and the results
are totally different.
I would appreciate any help!!
Thank you
Natalia Norden
More information about the R-help
mailing list