[R] Odds Ratio and Logistic Regression
Michael Dewey
info at aghmed.fsnet.co.uk
Mon Dec 31 15:24:37 CET 2012
At 18:14 30/12/2012, Lorenzo Isella wrote:
>Dear All,
>I am learning the ropes about logistic regression in R.
>I found some interesting examples
>
>http://bit.ly/Vq4GgX
>http://bit.ly/W9fUTg
>http://bit.ly/UfK73e
>
>but I am a bit lost.
>I have several questions.
>1) For instance, what is the difference between
>
>glm.out = glm(response ~ poverty + gender, family=binomial(logit),
> data=mydata)
>
>and
>
>glm.out = glm(response ~ poverty * gender, family=binomial(logit),
> data=mydata)
>? Which raises the question of when I should use the "*" or "+" sign when doing
>a logistic regression on several explanatory variables. I think that in
>the former case I am allowing for an interaction between poverty and
>gender, but I would like to be sure about it.
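I think you need to (re-)read any introductory
text on R, in particular the part about the use of
formulae. The asterisk implies an interaction:
poverty * gender expands to
poverty + gender + poverty:gender, whereas the
plus sign alone fits main effects only. So it is
the second model, the one with the asterisk, that
allows for an interaction. This also answers your
question about when to use each sign, I think.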
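
As a rough illustration of the formula syntax, here is a small sketch. The data frame is simulated (hypothetical values, not your data), and the object names add.terms, int.terms, fit.add and fit.int are made up for the example.

```r
# Simulated stand-in for mydata; values are hypothetical
set.seed(1)
n <- 200
mydata <- data.frame(
  poverty  = factor(sample(c("Above poverty line", "Below poverty line"),
                           n, replace = TRUE)),
  gender   = factor(sample(c("MALE", "FEMALE"), n, replace = TRUE),
                    levels = c("MALE", "FEMALE")),
  response = rbinom(n, 1, 0.7)
)

# Terms implied by each formula
add.terms <- attr(terms(response ~ poverty + gender), "term.labels")
int.terms <- attr(terms(response ~ poverty * gender), "term.labels")
print(add.terms)  # main effects only
print(int.terms)  # main effects plus poverty:gender

# Fit both models; the second has one extra coefficient
fit.add <- glm(response ~ poverty + gender, family = binomial(logit),
               data = mydata)
fit.int <- glm(response ~ poverty * gender, family = binomial(logit),
               data = mydata)

# Likelihood ratio test of the interaction term
print(anova(fit.add, fit.int, test = "Chisq"))
```

One possible rule of thumb: use "*" when you want to ask whether the effect of poverty differs between the genders, and "+" when you are content to assume it does not.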
>2) Consider the following snippet
>
>
>glm.out = glm(response ~ poverty + gender, family=binomial(logit),
> data=mydata)
>
>where "response" is a dichotomous variable, poverty assumes only two
>values (Above poverty line and Below poverty line) and gender assumes only
>the Male or Female values.
>The command above leads to the following output
>#######################################
>print(summary(glm.out))
>Call:
>glm(formula = response ~ poverty + gender, family = binomial(logit),
> data = mydata)
>
>Deviance Residuals:
>    Min       1Q   Median       3Q      Max
>-2.2094   0.4269   0.4269   0.8033   1.1911
>
>Coefficients:
>                          Estimate Std. Error z value Pr(>|z|)
>(Intercept)                 0.9656     0.1477   6.538 6.25e-11 ***
>povertyBelow poverty line  -0.9978     0.3246  -3.074  0.00211 **
>genderFEMALE                1.3840     0.2549   5.429 5.68e-08 ***
>---
>Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
>(Dispersion parameter for binomial family taken to be 1)
>
> Null deviance: 494.81 on 499 degrees of freedom
>Residual deviance: 457.13 on 497 degrees of freedom
>AIC: 463.13
>
>Number of Fisher Scoring iterations: 4
>##############################################
>
>To calculate the odds ratios, I should then do the following
>
>exp(coef(glm.out))
>              (Intercept) povertyBelow poverty line              genderFEMALE
>                2.6263831                 0.3687033                 3.9909627
>
>but here I am lost about the interpretation. For instance, what are the
>odds of a positive response for those above versus below the poverty line
>in males? In females?
>
>I think that everything is there, but I cannot extract/interpret the info
>provided by R correctly.
>Any help is appreciated.
>Cheers
>
>Lorenzo
>
>
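
On the interpretation, a minimal sketch of the arithmetic may help. It uses only the coefficients printed in your summary output, and it assumes that "Above poverty line" and MALE are the reference levels, which is what the coefficient names suggest. The names b, groups, or.male and or.female are made up for the example.

```r
# Coefficients copied from the quoted summary(glm.out) output
b <- c(intercept = 0.9656, below = -0.9978, female = 1.3840)

# The four combinations of the two dummy variables
groups <- expand.grid(below = 0:1, female = 0:1)

# Linear predictor (log odds) of a positive response in each group
groups$log.odds <- b[["intercept"]] +
  b[["below"]]  * groups$below +
  b[["female"]] * groups$female

groups$odds <- exp(groups$log.odds)     # odds of a positive response
groups$prob <- plogis(groups$log.odds)  # corresponding probability
print(groups)

# Odds ratio, below versus above the poverty line, within each gender
or.male   <- with(groups, odds[below == 1 & female == 0] /
                          odds[below == 0 & female == 0])
or.female <- with(groups, odds[below == 1 & female == 1] /
                          odds[below == 0 & female == 1])
print(c(male = or.male, female = or.female))
```

Because the model has no interaction term, the two odds ratios are the same and equal exp(-0.9978), the 0.37 you obtained. The exponentiated intercept is not an odds ratio but the odds of a positive response in the reference group (males above the poverty line, under the assumption stated). With the "*" model the odds ratio for poverty would be allowed to differ between males and females.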
Michael Dewey
info at aghmed.fsnet.co.uk
http://www.aghmed.fsnet.co.uk/home.html