[R] problems with glm
stephenc at ics.mq.edu.au
Tue Oct 2 05:34:06 CEST 2007
I am having a couple of problems that someone may be able to shed some light on.
Question 1:
I am fitting a logistic regression model, but when I do this:
glm.model = glm(as.factor(form$finished) ~ ., family=binomial,
data=form[1:150000,])
I get this:
Error in model.frame(formula, rownames, variables, varnames, extras,
extranames, :
variable lengths differ (found for 'barrier')
which is very strange, because when I name the predictors explicitly like this:
glm.model = glm(as.factor(form$finished) ~ form$first + form$second +
form$third + form$barrier, family=binomial, data=form[1:150000,])
It produces a model:
Call:
glm(formula = as.factor(form$finished) ~ form$first + form$second +
    form$third + form$barrier, family = binomial, data = form[1:150000,
    ])

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.0884  -0.4932  -0.3951  -0.3006   2.7135

Coefficients:
               Estimate Std. Error  z value Pr(>|z|)
(Intercept)   -2.957831   0.021446 -137.920  < 2e-16 ***
form$first     0.624463   0.078036    8.002 1.22e-15 ***
form$second    0.754057   0.080787    9.334  < 2e-16 ***
form$third     7.718261   0.078532   98.281  < 2e-16 ***
form$barrier  -0.058185   0.002175  -26.751  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 144850  on 215213  degrees of freedom
Residual deviance: 133292  on 215209  degrees of freedom
AIC: 133302

Number of Fisher Scoring iterations: 5
Any idea why the first glm call doesn't work?
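
A possible explanation, sketched on toy data: terms written as form$... are evaluated against the full data frame, while the columns picked up by "." come from the subset passed as data, so the lengths can disagree. The small data frame, the 150/200 split, and the names train, fit_full, fit_train and err below are assumptions for illustration only, not the real data.

```r
# Hypothetical toy data standing in for `form` (200 rows instead of 215214)
set.seed(1)
n <- 200
form <- data.frame(
  barrier  = rnorm(n),
  first    = rbinom(n, 1, 0.3),
  finished = rbinom(n, 1, 0.4)
)
train <- form[1:150, ]

# Response comes from the full `form` (200 values); "." expands to the
# columns of `train` (150 values), so model.frame() reports a mismatch.
err <- tryCatch(
  glm(as.factor(form$finished) ~ ., family = binomial, data = train),
  error = function(e) conditionMessage(e)
)
print(err)

# Every term is form$..., so `data = train` is effectively ignored
# and the model is fitted on all 200 rows.
fit_full <- glm(as.factor(form$finished) ~ form$first + form$barrier,
                family = binomial, data = train)
length(fitted(fit_full))

# Bare column names are looked up in `data`, so the subset is respected.
fit_train <- glm(finished ~ ., family = binomial, data = train)
length(fitted(fit_train))
```

This would also be consistent with the 215213 null degrees of freedom in the summary above, which suggests the second call used every row rather than the first 150000.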
Question 2:
Now I want to predict, so I do this:
pred <- predict(glm.model,data=form[150001:200000,],type="response")
but when I try to use it I get this:
> pred <- predict(glm.model,data=form[150001:200000,],type="response")
> t = table(pred,form$finished[150001:200000])
Error in table(pred, form$finished[150001:2e+05]) :
all arguments must have the same length
and when I check the length, it confirms that pred is not 50000 long!
> length(pred)
[1] 215214
It doesn't look as though my selection of rows has worked at all. Does anyone
know why?
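
One thing that may be worth checking, again sketched on toy data: predict() for a glm takes the rows to score through its newdata argument, and an argument named data is passed along in "..." without being used. The toy data frame, the 150/50 split, the names p_fitted and p_new, and the 0.5 cutoff are all assumptions for illustration.

```r
# Hypothetical toy data standing in for `form`
set.seed(1)
n <- 200
form <- data.frame(
  barrier  = rnorm(n),
  first    = rbinom(n, 1, 0.3),
  finished = rbinom(n, 1, 0.4)
)
train <- form[1:150, ]
test  <- form[151:200, ]

# Bare column names, so both `data` and `newdata` can supply the values.
fit <- glm(finished ~ first + barrier, family = binomial, data = train)

# `data` is not a formal argument of predict.glm(); without `newdata`
# the fitted values for the training rows are returned.
p_fitted <- predict(fit, data = test, type = "response")
length(p_fitted) == nrow(train)

# `newdata` scores the held-out rows instead.
p_new <- predict(fit, newdata = test, type = "response")
length(p_new) == nrow(test)

# Predicted probabilities are continuous, so threshold them before
# cross-tabulating (0.5 is an arbitrary cutoff for this sketch).
table(predicted = p_new > 0.5, actual = test$finished)
```

Note that newdata can only take effect when the formula uses bare column names. With form$first and similar terms, the values are still read from the full data frame.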
Stephen