[R] GLM Starting Values
Gabor Grothendieck
ggrothendieck at gmail.com
Fri Jul 23 02:51:45 CEST 2010
On Thu, Jul 22, 2010 at 4:56 PM, Tyler Williamson <tswillia at ucalgary.ca> wrote:
> Hello,
>
> Suppose one is interested in fitting a GLM with a log link to binomial data. How does R choose starting values for the estimation procedure? Assuming I don't supply them.
>
Assuming weights are not specified, it uses this if there is a
one-column response:
mustart <- (y + 0.5) / 2
and this if there is a two-column response (successes, failures),
where y is first converted to the proportion of successes:
n <- y[, 1] + y[, 2]
y <- y[, 1] / n
mustart <- (n * y + 0.5) / (n + 1)
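Both formulas can be checked by hand. The following is a minimal sketch, assuming unit prior weights and that y in the two-column formula stands for the observed proportion of successes, y[, 1] / n. The objects y_one, y_two and p are made-up illustration names and data, not anything glm() creates.

```r
# One-column 0/1 response with unit prior weights (illustrative data)
y_one <- c(0, 1, 1, 0)
mustart_one <- (y_one + 0.5) / 2
mustart_one   # 0.25 where y is 0, 0.75 where y is 1

# Two-column response: cbind(successes, failures) (illustrative data)
y_two <- cbind(c(0, 3, 10), c(5, 7, 0))
n <- y_two[, 1] + y_two[, 2]   # number of trials per row
p <- y_two[, 1] / n            # observed proportion of successes
mustart_two <- (n * p + 0.5) / (n + 1)
mustart_two   # strictly between 0 and 1, even for rows with p = 0 or p = 1
```

Either way the starting means are pulled away from 0 and 1, so the link function is finite at every observation.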
etastart is the link function evaluated at mustart. For example,
given this data:
set.seed(123)
f1 <- factor(sample(c("a", "b"), 100, replace = TRUE))
f2 <- factor(sample(c("x", "y"), 100, replace = TRUE))
y <- sample(c(0, 1), 100, replace = TRUE)
# compare these two:
# default mustart and etastart
fm <- glm(y ~ f1 + f2, family = "binomial", control = list(trace = TRUE))
# specify mustart and etastart to equal defaults
# (qlogis is the logit link, the default link for binomial)
mustart <- (y + 0.5) / 2
fm <- glm(y ~ f1 + f2, family = "binomial",
  mustart = mustart, etastart = qlogis(mustart),
  control = list(trace = TRUE))
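Rather than comparing the trace output by eye, the two fits can also be compared programmatically. This is a sketch only: the names fm_default, fm_manual and etastart_log are introduced here for illustration. The last lines show how etastart would be formed for the log link asked about in the question, assuming the same mustart is used.

```r
set.seed(123)
f1 <- factor(sample(c("a", "b"), 100, replace = TRUE))
f2 <- factor(sample(c("x", "y"), 100, replace = TRUE))
y <- sample(c(0, 1), 100, replace = TRUE)

# default starting values for a one-column response with unit weights
mustart <- (y + 0.5) / 2

fm_default <- glm(y ~ f1 + f2, family = binomial)
fm_manual <- glm(y ~ f1 + f2, family = binomial,
                 mustart = mustart, etastart = qlogis(mustart))

# same starting point, so the estimates and iteration counts should agree
all.equal(coef(fm_default), coef(fm_manual))
fm_default$iter == fm_manual$iter

# with a log link, etastart is log(mustart) rather than qlogis(mustart)
etastart_log <- binomial(link = "log")$linkfun(mustart)
all.equal(etastart_log, log(mustart))
```

To start a log-link fit from these values, one would pass family = binomial(link = "log") together with mustart and etastart_log.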