[R] From THE R BOOK -> Warning: In eval(expr, envir, enclos) : non-integer #successes in a binomial glm!
Berwin A Turlach
berwin at maths.uwa.edu.au
Tue Mar 30 18:52:38 CEST 2010
G'day all,
On Tue, 30 Mar 2010 16:19:46 +0100
Corrado <ct529 at york.ac.uk> wrote:
> David Winsemius wrote:
> > A) It is not an error, only a warning. Wouldn't it seem reasonable
> > to issue such a warning if you have data that violates the
> > distributional assumptions?
> I am not questioning the approach. I am only trying to understand why
> a (rather expensive) source of documentation and the behaviour of a
> function are not aligned.
1) Even expensive books have typos in them.
2) glm() is from a package that is part of R, and the author of this
   book is AFAIK not a member of R core, hence has no control over
   whether his documentation and the behaviour of a function are
   aligned.
   a) If he were documenting a function that was part of a package he
      wrote as support for his book, as some authors do, there might be
      a reason to complain. But then 1) would still apply.
   b) Even books written by members of R core occasionally have
      misalignments between the behaviour of a function and the
      documentation contained in such books. This can be due to them
      documenting a function over whose implementation they do not have
      control (e.g. a function in a contributed package) or the fact
      that R is improving/changing from version to version while books
      are rather static.
For these reasons it is always worthwhile to check the errata page for
a book, if one exists.
The warning arises because you do not provide all necessary
information about your response. If your response is binomial (with a
mean depending on some explanatory variables), then each response
consists of two numbers, the number of trials and the number of
successes. If you calculate the observed proportion of successes from
these two numbers and feed this into glm() as the response, you are
omitting necessary information. In this case, you should provide the
number of trials on which each proportion is based as prior weights.
For example:
R> x <- seq(from=-1,to=1,length=41)
R> px <- exp(x)/(1+exp(x))
R> nn <- sample(8:12, 41, replace=TRUE)
R> yy <- rbinom(41, size=nn, prob=px)
R> y <- yy/nn
R> glm(y~x, family=binomial, weights=nn)
Call:  glm(formula = y ~ x, family = binomial, weights = nn)

Coefficients:
(Intercept)            x
      0.246        1.124

Degrees of Freedom: 40 Total (i.e. Null);  39 Residual
Null Deviance:      91.49
Residual Deviance: 50.83        AIC: 157.6
R> glm(y~x, family=binomial)
Call:  glm(formula = y ~ x, family = binomial)

Coefficients:
(Intercept)            x
     0.2143       1.1152

Degrees of Freedom: 40 Total (i.e. Null);  39 Residual
Null Deviance:      9.256
Residual Deviance: 5.229        AIC: 49.87
Warning message:
In eval(expr, envir, enclos) : non-integer #successes in a binomial glm!
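
As a rough sketch that goes beyond the session above: the number of
trials can also be supplied through a two-column response (successes,
failures) instead of prior weights, and the two forms should give the
same fit. The seed is arbitrary, and the names fit_w, fit_c and msg are
only illustrative; the data are simulated as in the example, so the
numbers will differ from the output shown there.

```r
set.seed(1)  # arbitrary seed, only so that the sketch is reproducible

# Simulate binomial data as in the example above
x  <- seq(from = -1, to = 1, length.out = 41)
px <- exp(x) / (1 + exp(x))
nn <- sample(8:12, 41, replace = TRUE)   # number of trials
yy <- rbinom(41, size = nn, prob = px)   # number of successes
y  <- yy / nn                            # observed proportions

# Form 1: proportions as response, number of trials as prior weights
fit_w <- glm(y ~ x, family = binomial, weights = nn)

# Form 2: two-column response holding successes and failures
fit_c <- glm(cbind(yy, nn - yy) ~ x, family = binomial)

# Both forms carry the same information, so the fits should agree
all.equal(coef(fit_w), coef(fit_c))
all.equal(deviance(fit_w), deviance(fit_c))

# Without weights each proportion is treated as the outcome of a single
# trial, so the implied number of successes is not a whole number and
# glm() warns; capture the warning text instead of printing it
msg <- tryCatch(
  {
    glm(y ~ x, family = binomial)
    NA_character_
  },
  warning = function(w) conditionMessage(w)
)
msg
```

Which of the two forms to use is mostly a matter of how the data are
stored: if you only have proportions and the number of trials, the
weights form avoids having to reconstruct the counts.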
HTH,
Cheers,
Berwin
========================== Full address ============================
Berwin A Turlach Tel.: +61 (8) 6488 3338 (secr)
School of Maths and Stats (M019) +61 (8) 6488 3383 (self)
The University of Western Australia FAX : +61 (8) 6488 1028
35 Stirling Highway
Crawley WA 6009 e-mail: berwin at maths.uwa.edu.au
Australia http://www.maths.uwa.edu.au/~berwin