[R] Mixed model question.
Douglas Bates
bates at stat.wisc.edu
Tue Jul 29 00:14:09 CEST 2008
On Sun, Jul 27, 2008 at 9:06 PM, Rolf Turner <r.turner at auckland.ac.nz> wrote:
> I continue to struggle with mixed models. The square zero version
> of the problem that I am trying to deal with is as follows:
>
> A number (240) of students are measured (tested; for reading comprehension)
> on 6 separate occasions. Initially (square zero) I want to treat the
> test time as a factor (with 6 levels). The students are of course
> ``random effects''. Later I want to look at reduced models, with the
> means at the 6 times following patterns:
>
> (mu1, mu2, mu2, mu3, mu3, mu4)
>
> or
>
> (mu1, mu1+theta, mu1+theta, mu1+2*theta, mu1+2*theta, mu1+3*theta)
>
> the idea being that the tests are given ``in pairs'' at the beginning and
> end of the school year. Progress is expected over the school year, no
> progress over the two intervening summers. The summers fall between
> times 2 and 3 and between times 4 and 5. The second of the two models
> assumes a constant amount of progress (``theta'') in each of three
> school years in which the students are observed.
>
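A minimal sketch of one way those two patterns could be encoded as
variables on the six test times, assuming tstnum is the six-level
test-time factor used in the lmer() call below:

    ## pairs of test times sharing a mean: (mu1, mu2, mu2, mu3, mu3, mu4)
    schooldat$pair  <- factor(c(1, 2, 2, 3, 3, 4)[as.integer(schooldat$tstnum)])
    ## school years completed before each test: mean = mu1 + theta * years
    schooldat$years <- c(0, 1, 1, 2, 2, 3)[as.integer(schooldat$tstnum)]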
> But the square zero model is just a trivial repeated measures model.
>
> The estimate of the means at the 6 observation times is just
> mu-hat = xbar <- apply(X,2,mean) where X is the ``data matrix'',
> and the estimate of the covariance structure is just given by
> Sigma-hat = S <- var(X).
>
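A minimal sketch of that computation, assuming schooldat holds one
observation per student and test time, with the column names used in
the lmer() call below:

    ## 240 x 6 "data matrix": one row per student, one column per test time
    X    <- with(schooldat, tapply(y, list(stdnt, tstnum), mean))
    xbar <- colMeans(X)   # estimated means at the 6 observation times
    S    <- var(X)        # unrestricted 6 x 6 estimate of the covariance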
> No problem so far. I can also fit the two reduced models (I'm pretty
> sure) via maximum likelihood, using optim(), for example. Assuming
> normal data. Rash assumption, but that's not the issue here.
>
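A minimal sketch of that optim() route for the second reduced model,
assuming the X above. For a fixed mean vector the Gaussian MLE of Sigma
is the mean-centred crossproduct matrix, so the profile (negative)
log-likelihood is, up to a constant, the log-determinant of that matrix;
the starting values below are arbitrary:

    negprof <- function(par, X, g = c(0, 1, 1, 2, 2, 3)) {
      mu  <- par[1] + par[2] * g          # (mu1, mu1+theta, ..., mu1+3*theta)
      R   <- sweep(X, 2, mu)              # residuals from the structured mean
      Sig <- crossprod(R) / nrow(X)       # MLE of Sigma given this mean
      as.numeric(determinant(Sig, logarithm = TRUE)$modulus)
    }
    fit2 <- optim(c(mean(X[, 1]), 0.2), negprof, X = X)
    fit2$par                              # ML estimates of mu1 and theta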
> But the thing is, I also want (later!) to include such things as
> a school effect (there are 6 different schools), a sex effect, and
> an ethnicity effect.
>
> Things start to get complicated --- sounds like a job for lmer().
>
> So I'd just like to get a toe in the door by fitting the trivial
> (square 0) model with lmer() --- and then if I can get my head
> around that, move on to the two reduced models. I.e. I'd like to
> reproduce my simple-minded computations using lmer() --- which would
> give me a little bit of confidence that I'm driving lmer() correctly.
>
> *Can* the trivial model be fitted in lmer()? I tried using
>
> fit <- lmer(y ~ tstnum + (1|stdnt), data=schooldat)
>
> and got estimates of the coefficients for tstnum as follows:
>
> Fixed effects:
>             Estimate Std. Error t value
> (Intercept)  3.22917    0.09743   33.14
> tstnum2      0.46667    0.08461    5.52
> tstnum3      0.50000    0.08461    5.91
> tstnum4      0.83750    0.08461    9.90
> tstnum5      0.47083    0.08461    5.56
> tstnum6      0.97500    0.08461   11.52
>
> The mean of (the columns of) the data matrix is
>
> 3.229167 3.695833 3.729167 4.066667 3.700000 4.204167
>
> which is in exact agreement with the lmer() results when converted to
> the same parameterization (mu_i = mu + alpha_i, with alpha_1 = 0).
>
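A minimal check of that agreement, assuming fit is the lmer() object
above:

    b      <- fixef(fit)
    mu.hat <- b[1] + c(0, b[-1])   # intercept plus treatment contrasts
    mu.hat                         # should reproduce colMeans(X)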
> (Notice the surprising, depressing, and so far unexplained *drop*
> in the response over the second summer.)
>
> What I *don't* understand is the correlation structure of the estimates
> produced by lmer(), which is:
>
> Correlation of Fixed Effects:
>         (Intr) tstnm2 tstnm3 tstnm4 tstnm5
> tstnum2 -0.434
> tstnum3 -0.434  0.500
> tstnum4 -0.434  0.500  0.500
> tstnum5 -0.434  0.500  0.500  0.500
> tstnum6 -0.434  0.500  0.500  0.500  0.500
>
> So apparently the way I called lmer() places substantial constraints
> on the covariance structure.
That's the correlation matrix of the fixed-effects parameters. You
should have separately gotten estimates of the variance-covariance of
the random effects, which you coyly did not show us :-). Because you
are allowing only a simple, scalar random effect per student, there
will be an estimate of the variance of this random effect and an
estimate of the residual variance.
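A minimal sketch of pulling those two components out, using the
accessor names of current lme4 and the 'stdnt' grouping factor from the
call above; the scalar random effect forces a compound-symmetric
covariance on a student's six measurements:

    vc     <- as.data.frame(VarCorr(fit))
    sig.b2 <- vc$vcov[vc$grp == "stdnt"]      # between-student variance
    sig.e2 <- vc$vcov[vc$grp == "Residual"]   # residual variance
    ## implied covariance of the 6 tests: sig.b2 everywhere, plus sig.e2 on the diagonal
    Sigma.cs <- matrix(sig.b2, 6, 6) + diag(sig.e2, 6)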
> How can I (is there any way that I can)
> tell lmer() to fit the most general possible covariance structure?
It sounds like you want a model formula of
lmer(y ~ tstnum + (0 + tstnum|stdnt), data=schooldat)
but that model will have 21 variance-covariance terms to estimate (22
if you count the residual variance, but that one gets profiled out of
the optimization). I would not be surprised if the estimated
variance-covariance matrix for the random effects turns out to be
singular.
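If that larger model does converge, a minimal sketch of the check
against the unrestricted estimate (accessor names are those of current
lme4):

    fit.gen <- lmer(y ~ tstnum + (0 + tstnum | stdnt), data = schooldat)
    ## marginal covariance of a student's 6 tests: random-effect part plus residual
    Sig.hat <- VarCorr(fit.gen)$stdnt + diag(sigma(fit.gen)^2, 6)
    Sig.hat                                   # compare with S <- var(X)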
> As usual, advice, insight, tutelage humbly appreciated.
> If anyone wishes to experiment with the real data set (it's a bit
> too big to post here) I can make it available to them via email.
Generally I would jump at the chance but not with my "To Do" list in
its current, sadly over-committed, state.