[R-sig-ME] Differences in standard and mixed effects logistic regression - rounding error or conceptual mistake?
mio nome
n.pubblic at gmail.com
Wed Oct 26 18:16:48 CEST 2016
I'm a bit confused. To my understanding, a standard logistic regression
should be equivalent to a mixed-effects logistic regression in which the
statistical unit (the subject) is included as a random effect - but I found
that the results of the two analyses differ.
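To make the comparison concrete, the two models as I understand them (and
this may be exactly where I go wrong) are, writing p_i for subject i's
probability of a right answer and p_ij for its trial-level counterpart:

$$\mathrm{GLM:}\quad \mathrm{logit}(p_i) = \beta_0 + \beta_1 c_i$$

$$\mathrm{GLMM:}\quad \mathrm{logit}(p_{ij}) = \beta_0 + \beta_1 c_i + u_i, \qquad u_i \sim N(0, \sigma_u^2)$$

so the mixed model estimates one extra parameter, the between-subject
variance sigma_u^2.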
As an example, let's say that I have an experiment with 20 participants. I
ask each participant a number of questions, and each answer is scored as
either "right" or "wrong".
I simulated the dataset with:
data0 = cbind(data.frame(matrix(sample(1:10, 40, replace = TRUE),
                                nrow = 20, byrow = TRUE)),
              rep(c('a', 'b'), each = 10))
names(data0) = c('s', 'f', 'c')
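(In hindsight I should have fixed the random seed. For anyone re-running
this, a line like the one below - with an arbitrary placeholder value,
since my reported output came from an unseeded draw - makes the simulation
reproducible:)

set.seed(123)  # placeholder seed; the numbers reported below came from an unseeded run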
Let's say that the column *s* holds the number of right answers, the column
*f* the number of wrong answers, and the column *c* a between-subjects
experimental condition. The subjects are organized by rows.
To test the effect of *c* I can write:
anova(glm(cbind(s, f) ~ 1, data = data0, family = binomial),
      glm(cbind(s, f) ~ 1 + c, data = data0, family = binomial),
      test = "Chisq")
which gives me:
Model 1: cbind(s, f) ~ 1
Model 2: cbind(s, f) ~ 1 + c
  Resid. Df Resid. Dev Df Deviance Pr(>Chi)
1        19     49.144
2        18     42.978  1   6.1659  0.01302 *
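(Equivalently, if I recall drop1 correctly, the same likelihood-ratio test
can be obtained from the single full fit:)

# should report the same 6.1659 deviance change on 1 Df for dropping c
drop1(glm(cbind(s, f) ~ c, data = data0, family = binomial), test = "Chisq")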
So I tried to model the same data with a logistic mixed-effects model.
First I converted it to a long format with:
data1 = NULL
for (i in 1:nrow(data0)) {
  tt = c(rep(1, data0$s[i]), rep(0, data0$f[i]))
  t = cbind(tt, rep(i, length(tt)), rep(data0$c[i], length(tt)))
  data1 = rbind(data1, t)
}
data1 = data.frame(data1)
names(data1) = c("r", "s", "c")
data1$s = factor(data1$s)
data1$c = factor(data1$c)
Here *s* is the subject ID. Then I tested the effect of *c* with:
library(lme4)
anova(glmer(r ~ 1 + (1 | s), data = data1, family = binomial),
      glmer(r ~ 1 + c + (1 | s), data = data1, family = binomial),
      test = "Chisq")
Now the same analysis gives me a different result!
Model 1: r ~ 1 + (1 | s)
Model 2: r ~ 1 + c + (1 | s)
  Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
1  2 290.72 297.45 -143.36   286.72
2  3 289.97 300.07 -141.98   283.97 2.7548      1    0.09696 .
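One check I can think of: if I'm not mistaken, the aggregated binomial
likelihood and the long-format Bernoulli likelihood differ only by a
constant (the binomial coefficients), so a plain glm on data1 should
reproduce the deviance change of the aggregated fit exactly:

anova(glm(r ~ 1, data = data1, family = binomial),
      glm(r ~ 1 + c, data = data1, family = binomial),
      test = "Chisq")
# should show the same Deviance (6.1659 on 1 Df) and p-value (0.01302)
# as the aggregated fit, even though the residual deviances differ

If that holds, the discrepancy must come from the (1 | s) term itself, not
from the reshaping.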
Is it because of rounding errors in the implementation of the underlying
numerical routines, or is there some fundamental concept I'm missing here?
PS: the results of the two analyses can be closer than in the case reported
here - I cherry-picked an extreme example for the sake of a clear
explanation.