[R] alternative to logistic regression
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri Nov 16 16:28:27 CET 2007
On Fri, 16 Nov 2007, Terry Therneau wrote:
> You can fit a linear probability model with glm and a bit of arm twisting.
> First, make your own copy of the binomial function:
> > dump('binomial', file='mybinom.R')
>
> Edit it to change the function name to "mybinom" (or anything else you
> like), and to add 'identity' to the list of okLinks.
Hmm ... I think you are generalizing from another R-like system.
binomial("identity") works out of the box, and R's glm() will
backtrack if it encounters a fitted value < 0 or > 1. Now, the
back-tracking can get stuck but it often does a reasonable job.
Examples:
set.seed(1)
x <- seq(0, 1, length=10)
y <- rbinom(10, 10, 0.1 + 0.8*x)
glm(cbind(y, 10-y) ~ x, binomial("identity"), start=c(0.5,0))
works. But variants, e.g.
y <- rbinom(10, 10, 0.1 + 0.9*x)
backtrack and give warnings.
What does not work is binomial(identity), unlike binomial(logit).
> Source the file back in, and use mybiom('identity') to fit the model.
>
> Caveat Emptor: This will work just fine if all of your data is far enough away
> from 0 or 1. But if the predicted value for any data point, at any time during
> the iteration, is <=0 or >=1 then the calculation of the log-likelihood will
> involve an NA and the glm routine will fail. NAs produced deep inside a
> computation tend to produce unhelpful and/or misleading error messages
> (sometimes the NA can propogate for some ways through the code before creating a
> failure). You can also get the counterintuitive result that models with few or
> poor covariates work (all the predictions are near to the mean), but more useful
> covariates cause the model to fail.
>
> Linear links for both the binomial and Poisson are a challenging computational
> problem. But they are becoming important in medical work due to recent
> appreciation that the absolute risk attached to a variable is often more
> relevant than the relative risk (odds ratio or risk ratio).
>
> Terry Therneau
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-help
mailing list