[R] A logit question?
Renaud Lancelot
lancelot at sentoo.sn
Mon May 6 15:12:40 CEST 2002
It is possible to use frequency data, if by that you mean proportions.
There are two ways: the response is either a two-column matrix
(successes, failures) or a vector of proportions
(successes / (successes + failures)). In the latter case, you have to
supply the "weights" argument, with each weight equal to the denominator
of the corresponding proportion.
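
A minimal sketch of the two ways, assuming hypothetical grouped data (the
data frame "d" and its columns "dose", "succ" and "n" are made up for
illustration and do not come from this thread):

```r
# Hypothetical grouped data: 'succ' successes out of 'n' trials per dose.
d <- data.frame(dose = c(1, 2, 3, 4, 5),
                succ = c(2, 5, 9, 14, 18),
                n    = c(20, 20, 20, 20, 20))

# Way 1: two-column matrix response (successes, failures).
fit_matrix <- glm(cbind(succ, n - succ) ~ dose,
                  family = binomial(link = "logit"), data = d)

# Way 2: proportion response, with the denominators passed as weights.
fit_prop <- glm(succ / n ~ dose, weights = n,
                family = binomial(link = "logit"), data = d)

# Both calls describe the same model, so the coefficients should agree.
all.equal(coef(fit_matrix), coef(fit_prop))
```

Without the weights, glm() would treat each proportion as if it came from
a single trial, which is what triggers the "non-integer" warning.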
See help on glm:
?glm
[snip]
Details:
A typical predictor has the form `response ~ terms' where
`response' is the (numeric) response vector and `terms' is a
series of terms which specifies a linear predictor for `response'.
For `binomial' models the response can also be specified as a
`factor' (when the first level denotes failure and all others
success) or as a two-column matrix with the columns giving the
numbers of successes and failures. A terms specification of the
form `first + second' indicates all the terms in `first' together
with all the terms in `second' with duplicates removed.
[snip]
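
The factor form mentioned in the excerpt can be sketched as follows. The
vectors "outcome" and "x" are hypothetical, ungrouped data invented for
this illustration:

```r
# Hypothetical ungrouped data: one observation per trial.
outcome <- factor(c("fail", "fail", "ok", "fail", "ok", "ok", "fail", "ok"),
                  levels = c("fail", "ok"))  # first level denotes failure
x <- c(0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0)

# Factor response: the first level is failure, all other levels are success.
fit_factor <- glm(outcome ~ x, family = binomial)

# The same model with an explicit 0/1 coding of the response.
fit_01 <- glm(as.integer(outcome != "fail") ~ x, family = binomial)

# The two codings should give the same coefficients.
all.equal(unname(coef(fit_factor)), unname(coef(fit_01)))
```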
Hope this helps,
Renaud
Achim Zeileis wrote:
>
> Mäkinen Jussi wrote:
> >
> > I have received a few answers pointing out that the logit model is usually for
> > a binary response (dependent) variable. This was part of my (obviously
> > badly written) question: is it possible to regress frequency data (i.e. a
> > non-binary response) with glm(y~x, family=binomial(link=logit))?
> >
> > glm-help says:
> >
> > <snip>...For `binomial' models the response can also be specified as a
> > `factor' (when the first level denotes failure and all others success) or as
> > a two-column matrix with the columns giving the numbers of successes and
> > failures....<snip>
> >
> > which led me to think that it can handle frequency data (grouped data) as well.
> >
> > But shouldn't that give the same result as transforming the response and running
> > regular OLS?
>
> No, glm() gives you the ML estimate for the regression coefficients.
> Only for the gaussian family (with the identity link) are ML and OLS the same.
> Z
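
A minimal sketch of this point, assuming hypothetical grouped data (the
data frame "d" is invented for illustration). It compares the binomial ML
fit with OLS on the logit-transformed proportions, and shows that a
gaussian glm() reproduces lm():

```r
# Hypothetical grouped data (illustrative numbers, not from the thread).
d <- data.frame(x    = c(1, 2, 3, 4, 5),
                succ = c(2, 5, 9, 14, 18),
                n    = c(20, 20, 20, 20, 20))
d$p <- d$succ / d$n

# Binomial GLM: maximum likelihood on the binomial scale.
fit_ml <- glm(p ~ x, weights = n, family = binomial, data = d)

# OLS on the logit-transformed proportions; qlogis(p) is log(p / (1 - p)).
fit_ols <- lm(qlogis(p) ~ x, data = d)

# Gaussian GLM with identity link on the same transformed response.
fit_gauss <- glm(qlogis(p) ~ x, family = gaussian, data = d)

# The last two rows should coincide; the first is close but not identical.
rbind(ml = coef(fit_ml), ols = coef(fit_ols), gaussian = coef(fit_gauss))
```

The binomial fit weights each observation by its binomial variance, while
OLS on the logits treats all observations as equally precise, so the two
sets of estimates differ in general.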
>
> > Jussi
> >
> > Mäkinen Jussi wrote:
> >
> > >Hello dear r-gurus!
> > >
> > >I have a question about the logit model. I think I have misunderstood
> > >something and I'm trying to find the bug in my code or, even better, in my
> > >head. Any help is appreciated.
> > >
> > >In short: why don't I get the same coefficients from the
> > >logit regression when using a link function as when using an explicit
> > >transformation of the dependent variable? Some details below.
> > >
> > >I'm not very familiar with the concept. As far as I have understood, it's all
> > >about transforming the dependent variable when one has frequency data
> > >(grouped data, instead of raw binaries):
> > >
> > >ln(^p(i)/(1-^p(i))) = c + b_1(X_1) +...+ b_k(X_k) + e(i).
> > >
> > >where ^p(i) is the (estimated) frequency of the event (happened/all = n(i)/N), i
> > >is the index of the observation, c and b_. are the coefficients (the objects of
> > >the estimation), X_. are the explanatory variables and e is the residual. So a
> > >linear regression.
> > >
> > >And some testing:
> > >
> > >
> > >>y <- runif(100)
> > >>
> > Should you use a binomial (0,1) response variable?
> >
> > best regards!
> >
> > >>
> > >>X <- rnorm(100)
> > >>glm(y~ X, family=binomial(link=logit))
> > >>
> >
> > >>
> > >
> > >Call: glm(formula = y ~ X, family = binomial(link = logit))
> > >
> > >Coefficients:
> > >(Intercept) X
> > > -0.00956 0.10760
> > >
> > >Degrees of Freedom: 99 Total (i.e. Null); 98 Residual
> > >Null Deviance: 43.83
> > >Residual Deviance: 43.49 AIC: 142.3
> > >Warning message:
> > >non-integer #successes in a binomial glm! in: eval(expr, envir, enclos)
> > >
> > >
> > >
> > >### OR
> > >
> > >>glm(cbind(y, 1-y)~ X, family=binomial(link=logit)) ### ?glm
> > >>
> > >
> > >Call: glm(formula = cbind(y, 1 - y) ~ X, family = binomial(link = logit))
> > >
> > >Coefficients:
> > >(Intercept) X
> > > -0.00956 0.10760
> > >
> > >Degrees of Freedom: 99 Total (i.e. Null); 98 Residual
> > >Null Deviance: 43.83
> > >Residual Deviance: 43.49 AIC: 142.3
> > >Warning message:
> > >non-integer counts in a binomial glm! in: eval(expr, envir, enclos)
> > >
> > >
> > >
> > >### BUT
> > >
> > >>glm(y.logit.transformation(y)~ X)
> > >>
> > >
> > >Call: glm(formula = y.logit.transformation(y) ~ X)
> > >
> > >Coefficients:
> > >(Intercept) X
> > > 0.1233 0.1023
> > >
> > >Degrees of Freedom: 99 Total (i.e. Null); 98 Residual
> > >Null Deviance: 465.6
> > >Residual Deviance: 464.4 AIC: 443.3
> > >
> > >
> > >### OR
> > >
> > >>lm(y.logit.transformation(y)~ X)
> > >>
> > >
> > >Call:
> > >lm(formula = y.logit.transformation(y) ~ X)
> > >
> > >Coefficients:
> > >(Intercept) X
> > > 0.1233 0.1023
> > >
> > >
> > >It's close (AIC and residual deviance are different due to the transformation) but I
> > >think the relationship should be exact? Or is it just numerical
> > >inaccuracy? Or is there some reason hidden from me? Is it legitimate to
> > >use frequency regression when using R for the logit model (or are there alternatives)?
> > >
> > >I would like to know exactly what this warning message means:
> > >non-integer counts in a binomial glm! in: eval(expr, envir, enclos)
> > >
> > >The transformation of the dependent variable:
> > >
> > >"y.logit.transformation" <- function(y)
> > >{
> > > y.trans <- log(y/(1-y))
> > > y.trans
> > >}
> > >
> > >version
> > >
> > >platform i386-pc-mingw32
> > >arch i386
> > >os mingw32
> > >system i386, mingw32
> > >status
> > >major 1
> > >minor 5.0
> > >year 2002
> > >month 04
> > >day 29
> > >language R
> > >
> > >OS is Windows2000.
> > >
> > >Thank you for any help.
> > >
> > >deadlocked,
> > >
> > >Jussi Mäkinen
> > >Analyst
> > >State Treasury, Finland
> > >phone: +358-9-7725 616
> > >mobile: +358-50-5958 710
> > >www.statetreasury.fi
> > >mailto:jussi.makinen at valtiokonttori.fi
> > >
> > >
> > >
> > >-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> > >r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> > >Send "info", "help", or "[un]subscribe"
> > >(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
> > >_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
> > >
> >
--
Dr Renaud Lancelot, vétérinaire
CIRAD, Département Elevage et Médecine Vétérinaire (CIRAD-Emvt)
Programme Productions Animales
http://www.cirad.fr/presentation/programmes/prod-ani.shtml (Français)
http://www.cirad.fr/presentation/en/program-eng/prod-ani.shtml (English)
ISRA-LNERV tel (221) 832 49 02
BP 2057 Dakar-Hann fax (221) 821 18 79 (CIRAD)
Senegal e-mail renaud.lancelot at cirad.fr