[R] Logistic regression goodness of fit tests
Trevor Wiens
twiens at interbaun.com
Thu Mar 10 22:06:18 CET 2005
I was unsure which goodness-of-fit tests exist in R for logistic regression. After searching the R-help archive, I found that lrm from the Design package, together with resid, can be used to calculate one as follows:
d <- datadist(mydataframe)
options(datadist = 'd')
fit <- lrm(response ~ predictor1 + predictor2..., data = mydataframe, x = TRUE, y = TRUE)
resid(fit, 'gof')
I set up a script that first uses glm to create models and then uses stepAIC to determine the optimal model. I used stepAIC instead of fastbw because I found the AIC values to be completely different and the final models didn't always match. My script then takes the reduced model formula and refits it using lrm as above. The problem is that for some models I run into an error for which I can find no reference on the mailing list or on the web. It is as follows:
test.lrm <- lrm(cclo ~ elev + aspect + cti_var + planar + feat_div + loamy + sands + sandy + wet + slr_mean, data=datamatrix, x = T, y = T)
singular information matrix in lrm.fit (rank= 10 ). Offending variable(s):
slr_mean
Error in j:(j + params[i] - 1) : NA/NaN argument
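The selection workflow described above (fit with glm, reduce with stepAIC, then hand the reduced formula to another fitter) can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated, the names sim, x1, x2, x3 and y are hypothetical, and stats::step is used as a fallback if MASS is not installed. The final refit with lrm is left as a comment because it needs the Design package.

```r
# Simulated stand-in data; all variable names here are hypothetical.
set.seed(1)
n <- 500
sim <- data.frame(
  x1 = rnorm(n),
  x2 = rbinom(n, 1, 0.3),
  x3 = rnorm(n)            # pure noise, unrelated to the response
)
sim$y <- rbinom(n, 1, plogis(-0.5 + 1.2 * sim$x1 + 0.8 * sim$x2))

# Full logistic model fitted with glm
full <- glm(y ~ x1 + x2 + x3, data = sim, family = binomial)

# Backward selection by AIC; fall back to stats::step if MASS is missing
reduced <- if (requireNamespace("MASS", quietly = TRUE)) {
  MASS::stepAIC(full, trace = FALSE)
} else {
  step(full, trace = 0)
}

# Formula of the reduced model, ready to pass to another fitting function
reduced_formula <- formula(reduced)
print(reduced_formula)

# With Design loaded, the refit would look like this (not run here):
# fit <- lrm(reduced_formula, data = sim, x = TRUE, y = TRUE)
```

Passing the formula object rather than retyping the terms keeps the glm and lrm fits on exactly the same set of predictors.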
If I pass the singularity criterion (tol) and lower it from the default of 1E-7 to 1E-9, or to 1E-12, which is the default in calibrate, the fit works. Why is that?
Not being a statistician but a biogeographer using regression as a tool, I don't really understand what is happening here.
Does changing the tol value change how I should interpret goodness-of-fit results or other evaluations of the models created?
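One possible reading of the singularity message, offered as an assumption and not as a confirmed diagnosis, is that a predictor with a large mean and a narrow spread (as slr_mean appears to have in the summary below) makes the information matrix numerically close to singular even though the model itself is well defined. The sketch below uses simulated data and base R only. The names big, flag and info_matrix are hypothetical, and glm stands in for lrm.

```r
set.seed(42)
n <- 1000
# Hypothetical predictor on a large scale with a narrow spread
big  <- rnorm(n, mean = 7871, sd = 40)
flag <- rbinom(n, 1, 0.25)
y    <- rbinom(n, 1, plogis(1 + 0.01 * (big - 7871) + 0.5 * flag))

# Information matrix X'WX of a logistic fit, with W = diag(p * (1 - p))
info_matrix <- function(fit) {
  X <- model.matrix(fit)
  w <- fit$weights          # working weights from the final IWLS iteration
  crossprod(X * sqrt(w))    # scales row i of X by sqrt(w[i]), then X'X
}

raw    <- glm(y ~ big + flag, family = binomial)
scaled <- glm(y ~ scale(big) + flag, family = binomial)

# Reciprocal condition numbers: values near zero mean "close to singular"
rc_raw    <- rcond(info_matrix(raw))
rc_scaled <- rcond(info_matrix(scaled))
print(c(raw = rc_raw, scaled = rc_scaled))

# Both parameterisations describe the same model, so the fitted
# probabilities should agree up to convergence tolerance
fit_diff <- max(abs(fitted(raw) - fitted(scaled)))
print(fit_diff)
```

If this is what is happening, centering or rescaling the predictor before fitting would be an alternative to lowering tol, and the fitted probabilities should be essentially the same either way.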
I've included a summary of the data below (in case it is helpful). It covers all variables in the data frame, since that was easier than selecting only those used in the model.
Thanks in advance.
T
--
Trevor Wiens
twiens at interbaun.com
The significant problems that we face cannot be solved at the same
level of thinking we were at when we created them.
(Albert Einstein)
----------------------------
summary(datamatrix)
   siteid          block         recordyear         cclo
564-125:   5   Min.   :1.000   Min.   :2000   Min.   :0.0000
564-130:   5   1st Qu.:2.000   1st Qu.:2001   1st Qu.:1.0000
564-135:   5   Median :3.000   Median :2002   Median :1.0000
564-140:   5   Mean   :3.042   Mean   :2002   Mean   :0.7509
564-145:   5   3rd Qu.:4.000   3rd Qu.:2003   3rd Qu.:1.0000
564-150:   5   Max.   :5.000   Max.   :2004   Max.   :1.0000
(Other):1098

     elev            slope            aspect         slr_mean
Min.   :0.0000   Min.   :0.1499   Min.   :0.0000   Min.   :7681
1st Qu.:0.0000   1st Qu.:0.5876   1st Qu.:0.0000   1st Qu.:7852
Median :1.0000   Median :0.9195   Median :0.0000   Median :7877
Mean   :0.6259   Mean   :1.2523   Mean   :0.2482   Mean   :7871
3rd Qu.:1.0000   3rd Qu.:1.6694   3rd Qu.:0.0000   3rd Qu.:7892
Max.   :1.0000   Max.   :5.3366   Max.   :1.0000   Max.   :7981

     cti           cti_var           planar         feat_div
Min.   :7.157   Min.   :0.4497   Min.   :0.0000   Min.   :1.000
1st Qu.:7.651   1st Qu.:0.6187   1st Qu.:1.0000   1st Qu.:2.000
Median :7.720   Median :0.8495   Median :1.0000   Median :3.000
Mean   :7.763   Mean   :0.9542   Mean   :0.8254   Mean   :3.379
3rd Qu.:7.822   3rd Qu.:1.1918   3rd Qu.:1.0000   3rd Qu.:4.000
Max.   :8.769   Max.   :2.5615   Max.   :1.0000   Max.   :6.000

   chop_san           loamy            sands            sandy
Min.   :0.00000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000
1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000
Median :0.00000   Median :0.0000   Median :0.0000   Median :0.0000
Mean   :0.05762   Mean   :0.3094   Mean   :0.3236   Mean   :0.1099
3rd Qu.:0.00000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.0000
Max.   :1.00000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000

      wet         timesinceburn         ndvi             evi
Min.   :0.00000   Min.   :  1.00   Min.   :0.1140   Min.   :0.1041
1st Qu.:0.00000   1st Qu.:100.00   1st Qu.:0.2973   1st Qu.:0.1667
Median :0.00000   Median :100.00   Median :0.3342   Median :0.2027
Mean   :0.01950   Mean   : 87.84   Mean   :0.3629   Mean   :0.2184
3rd Qu.:0.00000   3rd Qu.:100.00   3rd Qu.:0.4463   3rd Qu.:0.2711
Max.   :1.00000   Max.   :100.00   Max.   :0.5932   Max.   :0.4788

    msavi2              fc              gdd            precip
Min.   :0.09156   Min.   :0.1552   Min.   :380.6   Min.   : 50.04
1st Qu.:0.14936   1st Qu.:0.3246   1st Qu.:492.8   1st Qu.: 76.17
Median :0.18257   Median :0.4082   Median :500.8   Median : 85.50
Mean   :0.19653   Mean   :0.4398   Mean   :476.4   Mean   : 94.35
3rd Qu.:0.24626   3rd Qu.:0.5630   3rd Qu.:501.6   3rd Qu.: 95.16
Max.   :0.33258   Max.   :0.6996   Max.   :519.7   Max.   :163.86

  precip_1        precip_2         slr_yr
Min.   :164.2   Min.   :164.2   Min.   :7417
1st Qu.:254.2   1st Qu.:254.2   1st Qu.:7704
Median :338.0   Median :357.1   Median :7775
Mean   :298.1   Mean   :301.5   Mean   :7828
3rd Qu.:357.1   3rd Qu.:360.5   3rd Qu.:8014
Max.   :414.2   Max.   :414.2   Max.   :8151