[R-sig-ME] Fwd: Re: unplausible zero intercept variance in lmer
Ben Bolker
bbolker at gmail.com
Tue Nov 15 05:56:19 CET 2016
(forwarded from off-line query)
> ... I have consulted the help pages and noticed that others have
> posted similar problems, but I could not find a satisfactory solution.
> The problem is that lmer evaluates the intercept variance at 0, although
> it is clear that there is variance when you look at the data.
>
> The problem is in the model fit.disteffect (see below for the output).
> This includes only one factor from a two-factorial design. In the full
> model, the intercept variance is non-zero. A reviewer asked for effect
> size information, so I calculated partial models, and then I noticed
> this odd result.
>
> I’d be very glad if you could have a quick look and let me know how to
> circumvent the problem. Code is below, data attached. I am hoping that
> this allows you to reproduce the error.
>
The problem here is really a fundamental one; it's not a mistake,
although you could argue (see below) that it is a limitation of the
class of models that lme4 implements. The basic problem is that realized
(observed) among-group variation (which includes a term due to
among-group variance and a term due to residual variance) is actually
_smaller_ than the among-group variation that would be expected from
independent observations. This is discussed in some detail at
http://tinyurl.com/glmmFAQ#zero-variance (to the general readership of
r-sig-mixed-models: clarifications, feedback, and additional references are
always welcome!). Another way of putting it is that the "group effects"
model implemented by lme4 (i.e. y_{ij} = (fixed covariates) + eps_{1,i}
+ eps_{2,ij}) only allows for positive correlations among the
observations within a group.
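To make the boundary estimate concrete, here is a small self-contained simulation (not from the original post; names like d_sim are made up). Centering each subject's residuals forces the observations within a subject to be negatively correlated, so the realized among-subject variation is smaller than expected under independence and lmer's variance estimate collapses to zero:

```r
library(lme4)
set.seed(101)
n_subj <- 50; n_obs <- 4
d_sim <- data.frame(subj = factor(rep(seq(n_subj), each = n_obs)))
## induce negative within-subject correlation by centering the residuals
## within each subject: the four observations per subject then sum to zero
eps <- rnorm(n_subj * n_obs)
eps <- ave(eps, d_sim$subj, FUN = function(x) x - mean(x))
d_sim$y <- 10 + eps
fit_sim <- lmer(y ~ 1 + (1 | subj), data = d_sim)
VarCorr(fit_sim)  ## among-subject std dev is estimated at (essentially) zero
```

The fit is flagged as singular, which is exactly the symptom described above: the model can only represent non-negative within-group correlation, so the best it can do is put the among-subject variance on its boundary at zero.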
Data not included in this post, but here is the data manipulation:
library(lme4)
library(reshape2)
dat <- read.csv2("kuc_vod_study3.csv", header = TRUE)
## reshape the four distance/exposure columns to long format
d <- melt(dat,
          measure.vars = c("dist_1_20", "dist_100_20",
                           "dist_1_80", "dist_100_80"))
## reorder factor levels and code the two factors as +/- 0.5 contrasts
d <- transform(d,
               variable = factor(variable, levels(variable)[c(1, 3, 2, 4)]),
               dist = ifelse(grepl("^dist_1_", variable), -0.5, 0.5),
               exp = ifelse(grepl("_80$", variable), 0.5, -0.5))
## define observations within subjects
library(plyr)
d <- ddply(d, "subj_no", mutate, obs = factor(1:4))
Here's the basic model:
fit.disteffect <- lmer(value ~ dist + (1 | subj_no), data = d)
... and as promised the estimated among-subject variance is zero.
VarCorr(fit.disteffect)
## Groups   Name        Std.Dev.
## subj_no  (Intercept)  0.000
## Residual              37.849
... although the *upper* confidence limit on the among-group std dev
is perfectly sensible.
confint(fit.disteffect,parm="theta_")
Computing profile confidence intervals ...
2.5 % 97.5 %
.sig01 0.00000 7.534858
.sigma 36.27066 39.393044
One solution to this is to use nlme::lme, which allows for a
compound-symmetric model:
library(nlme)
fit.disteffect2 <- lme(value ~ dist,
                       random = list(subj_no = pdCompSymm(~ obs - 1)),
                       data = d)
VarCorr(fit.disteffect2)
## subj_no = pdCompSymm(obs - 1)
## Variance StdDev Corr
## obs1 1255.9355 35.43918
## obs2 1255.9355 35.43918 -0.011
## obs3 1255.9355 35.43918 -0.011 -0.011
## obs4 1255.9355 35.43918 -0.011 -0.011 -0.011
## Residual 176.6014 13.28915
... this shows that the estimated within-group correlations are
(slightly) negative
intervals(fit.disteffect2)
## Approximate 95% confidence intervals
...
## Random Effects:
## Level: subj_no
## lower est. upper
## std. dev 0.01073749 35.4391804 1.169673e+05
## corr.     -0.16154262  -0.0112256 2.091866e-01
The confidence interval on the correlation (-0.16 to 0.21) seems
reasonable, but the CI on the standard deviation is ridiculous (i.e.,
the Wald approximation has failed).
The glmmTMB package also has some (experimental!) capacity for fitting
compound-symmetric models, and lme4 _could_ be hacked to do
compound-symmetric models (without touching the underlying code) if one
wanted to badly enough ...
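For what it's worth, here is a sketch of what the glmmTMB version might look like, using its cs() covariance structure. This is untested against these data (they are not attached here), and as noted this part of glmmTMB is experimental:

```r
library(glmmTMB)
## compound symmetry among the four observations within each subject;
## 'obs' and 'subj_no' are the variables constructed in the data step above
fit.disteffect3 <- glmmTMB(value ~ dist + cs(obs + 0 | subj_no),
                           data = d)
VarCorr(fit.disteffect3)
```

Like pdCompSymm in nlme, cs() estimates a single common correlation among the observations in a group, and that correlation is allowed to be negative.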