[R] Odds Ratio and Logistic Regression
Lorenzo Isella
lorenzo.isella at gmail.com
Sun Dec 30 19:14:55 CET 2012
Dear All,
I am learning the ropes of logistic regression in R.
I found some interesting examples:
http://bit.ly/Vq4GgX
http://bit.ly/W9fUTg
http://bit.ly/UfK73e
but I am a bit lost.
I have several questions.
1) For instance, what is the difference between
    glm.out = glm(response ~ poverty + gender, family=binomial(logit),
                  data=mydata)
and
    glm.out = glm(response ~ poverty * gender, family=binomial(logit),
                  data=mydata)
? This raises the question of when I should use the "*" or the "+" sign
when doing a logistic regression on several explanatory variables. I think
that the "*" version allows for an interaction between poverty and
gender, but I would like to be sure about it.
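
A minimal sketch of how the two formulas differ, assuming a simulated stand-in for mydata (the data frame "sim" and the coefficients used to generate it are hypothetical, chosen only for illustration). In R's formula syntax, a * b is shorthand for a + b + a:b, so the "*" model carries one extra interaction coefficient, and the two fits can be compared with a likelihood-ratio test:

```r
# The expansion of the formula operators can be inspected without any data.
labels_plus <- attr(terms(response ~ poverty + gender), "term.labels")
labels_star <- attr(terms(response ~ poverty * gender), "term.labels")
print(labels_plus)  # main effects only
print(labels_star)  # main effects plus the poverty:gender interaction

# Simulated stand-in for mydata (hypothetical values, not the real data set).
set.seed(1)
n <- 500
sim <- data.frame(
  poverty = factor(sample(c("Above poverty line", "Below poverty line"),
                          n, replace = TRUE)),
  gender  = factor(sample(c("MALE", "FEMALE"), n, replace = TRUE),
                   levels = c("MALE", "FEMALE"))
)
# Response generated from an additive model on the log-odds scale.
eta <- 1 - 1 * (sim$poverty == "Below poverty line") +
  1.4 * (sim$gender == "FEMALE")
sim$response <- rbinom(n, 1, plogis(eta))

fit_plus <- glm(response ~ poverty + gender, family = binomial(logit), data = sim)
fit_star <- glm(response ~ poverty * gender, family = binomial(logit), data = sim)

# The "*" fit has one more coefficient: the interaction term.
print(names(coef(fit_star)))

# Likelihood-ratio test of whether the interaction improves the fit.
print(anova(fit_plus, fit_star, test = "Chisq"))
```

With "+", the effect of poverty on the log-odds is forced to be the same for both genders; with "*", it is allowed to differ by gender.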
2) Consider the following snippet
    glm.out = glm(response ~ poverty + gender, family=binomial(logit),
                  data=mydata)
where "response" is a dichotomous variable, poverty assumes only two
values (Above poverty line and Below poverty line) and gender assumes only
the Male or Female values.
The command above leads to the following output
#######################################
print(summary(glm.out))

Call:
glm(formula = response ~ poverty + gender, family = binomial(logit),
    data = mydata)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.2094   0.4269   0.4269   0.8033   1.1911

Coefficients:
                          Estimate Std. Error z value Pr(>|z|)
(Intercept)                 0.9656     0.1477   6.538 6.25e-11 ***
povertyBelow poverty line  -0.9978     0.3246  -3.074  0.00211 **
genderFEMALE                1.3840     0.2549   5.429 5.68e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 494.81  on 499  degrees of freedom
Residual deviance: 457.13  on 497  degrees of freedom
AIC: 463.13

Number of Fisher Scoring iterations: 4

##############################################
To then calculate the odds ratios, I should do the following
exp(coef(glm.out))
              (Intercept) povertyBelow poverty line              genderFEMALE
                2.6263831                 0.3687033                 3.9909627
but here I am lost about the interpretation. For instance, what are the
odds of a positive response for those above versus below the poverty line
in males? In females?
I think that everything is there, but I cannot extract/interpret the info
provided by R correctly.
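
As a sketch of one way to read these numbers, the odds for each of the four groups can be rebuilt from the printed coefficients alone. The code assumes that "Above poverty line" and MALE are the reference levels (which the coefficient names suggest) and copies the estimates and standard errors by hand from the summary output; the variable names are made up for illustration. The Wald interval at the end is a simple approximation from the printed standard error, not what confint() would return.

```r
# Estimates and standard errors copied from the summary output (log-odds scale).
b  <- c(intercept = 0.9656, below = -0.9978, female = 1.3840)
se <- c(intercept = 0.1477, below = 0.3246, female = 0.2549)

# All four combinations of the two dummy variables
# (below = 1: below the poverty line; female = 1: FEMALE).
groups <- expand.grid(below = c(0, 1), female = c(0, 1))

# Linear predictor (log-odds), then odds and probability of a positive response.
groups$log_odds <- b[["intercept"]] +
  b[["below"]] * groups$below +
  b[["female"]] * groups$female
groups$odds <- exp(groups$log_odds)
groups$prob <- plogis(groups$log_odds)
print(groups)

# Odds ratio, below vs above the poverty line, within each gender.
or_male <- groups$odds[groups$below == 1 & groups$female == 0] /
  groups$odds[groups$below == 0 & groups$female == 0]
or_female <- groups$odds[groups$below == 1 & groups$female == 1] /
  groups$odds[groups$below == 0 & groups$female == 1]
print(c(male = or_male, female = or_female))

# Approximate 95% Wald interval for the poverty odds ratio.
z <- qnorm(0.975)
or_ci <- exp(b[["below"]] + c(lower = -z, upper = z) * se[["below"]])
print(or_ci)
```

Under these assumptions the odds of a positive response are roughly 2.63 (above) and 0.97 (below) for males, and roughly 10.48 (above) and 3.86 (below) for females. Because the model has no interaction term, the below-versus-above odds ratio is the same in both genders, exp(-0.9978), which is about 0.37; the intercept's 2.63 is the odds for the reference group, not an odds ratio.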
Any help is appreciated.
Cheers
Lorenzo