[R] Mixed model question.
Douglas Bates
bates at stat.wisc.edu
Tue Jul 29 00:14:09 CEST 2008
On Sun, Jul 27, 2008 at 9:06 PM, Rolf Turner <r.turner at auckland.ac.nz> wrote:
> I continue to struggle with mixed models. The square zero version
> of the problem that I am trying to deal with is as follows:
>
> A number (240) of students are measured (tested; for reading comprehension)
> on 6 separate occasions. Initially (square zero) I want to treat the
> test time as a factor (with 6 levels). The students are of course
> ``random effects''. Later I want to look at reduced models, with the
> means at the 6 times following patterns:
>
> (mu1, mu2, mu2, mu3, mu3, mu4)
>
> or
>
> (mu1, mu1+theta, mu1+theta, mu1+2*theta, mu1+2*theta, mu1+3*theta)
>
> the idea being that the tests are given ``in pairs'' at the beginning and
> end of the school year. Progress is expected over the school year, no
> progress over the two intervening summers. The summers fall between
> times 2 and 3 and between times 4 and 5. The second of the two models
> assumes a constant amount of progress (``theta'') in each of three
> school years in which the students are observed.
>
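A minimal sketch of one way those two patterns could be encoded as
variables on the six test times, assuming tstnum is the six-level
test-time factor used in the lmer() call below:

    ## pairs of test times sharing a mean: (mu1, mu2, mu2, mu3, mu3, mu4)
    schooldat$pair  <- factor(c(1, 2, 2, 3, 3, 4)[as.integer(schooldat$tstnum)])
    ## school years completed before each test: mean = mu1 + theta * years
    schooldat$years <- c(0, 1, 1, 2, 2, 3)[as.integer(schooldat$tstnum)]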
> But the square zero model is just a trivial repeated measures model.
>
> The estimate of the means at the 6 observation times is just
> mu-hat = xbar <- apply(X,2,mean) where X is the ``data matrix'',
> and the estimate of the covariance structure is just given by
> Sigma-hat = S <- var(X).
>
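A minimal sketch of that computation, assuming schooldat holds one
observation per student and test time, with the column names used in
the lmer() call below:

    ## 240 x 6 "data matrix": one row per student, one column per test time
    X    <- with(schooldat, tapply(y, list(stdnt, tstnum), mean))
    xbar <- colMeans(X)   # estimated means at the 6 observation times
    S    <- var(X)        # unrestricted 6 x 6 estimate of the covariance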
> No problem so far. I can also fit the two reduced models (I'm pretty
> sure) via maximum likelihood, using optim(), for example. Assuming
> normal data. Rash assumption, but that's not the issue here.
>
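A minimal sketch of that optim() route for the second reduced model,
assuming the X above. For a fixed mean vector the Gaussian MLE of Sigma
is the mean-centred crossproduct matrix, so the profile (negative)
log-likelihood is, up to a constant, the log-determinant of that matrix;
the starting values below are arbitrary:

    negprof <- function(par, X, g = c(0, 1, 1, 2, 2, 3)) {
      mu  <- par[1] + par[2] * g          # (mu1, mu1+theta, ..., mu1+3*theta)
      R   <- sweep(X, 2, mu)              # residuals from the structured mean
      Sig <- crossprod(R) / nrow(X)       # MLE of Sigma given this mean
      as.numeric(determinant(Sig, logarithm = TRUE)$modulus)
    }
    fit2 <- optim(c(mean(X[, 1]), 0.2), negprof, X = X)
    fit2$par                              # ML estimates of mu1 and theta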
> But the thing is, I also want (later!) to include such things as
> a school effect (there are 6 different schools), a sex effect, and
> an ethnicity effect.
>
> Things start to get complicated --- sounds like a job for lmer().
>
> So I'd just like to get a toe in the door by fitting the trivial
> (square 0) model with lmer() --- and then if I can get my head
> around that, move on to the two reduced models. I.e. I'd like to
> reproduce my simple-minded computations using lmer() --- which would
> give me a little bit of confidence that I'm driving lmer() correctly.
>
> *Can* the trivial model be fitted in lmer()? I tried using
>
> fit <- lmer(y ~ tstnum + (1|stdnt), data=schooldat)
>
> and got estimates of the coefficients for tstnum as follows:
>
> Fixed effects:
>             Estimate Std. Error t value
> (Intercept)  3.22917    0.09743   33.14
> tstnum2      0.46667    0.08461    5.52
> tstnum3      0.50000    0.08461    5.91
> tstnum4      0.83750    0.08461    9.90
> tstnum5      0.47083    0.08461    5.56
> tstnum6      0.97500    0.08461   11.52
>
> The mean of (the columns of) the data matrix is
>
> 3.229167 3.695833 3.729167 4.066667 3.700000 4.204167
>
> which is in exact agreement with the lmer() results when converted to
> the same parameterization (mu_i = mu + alpha_i, with alpha_1 = 0).
>
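A minimal check of that agreement, assuming fit is the lmer() object
above:

    b      <- fixef(fit)
    mu.hat <- b[1] + c(0, b[-1])   # intercept plus treatment contrasts
    mu.hat                         # should reproduce colMeans(X)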
> (Notice the surprising, depressing, and so far unexplained *drop*
> in the response over the second summer.)
>
> What I *don't* understand is the correlation structure of the estimates
> produced by lmer(), which is:
>
> Correlation of Fixed Effects:
>         (Intr) tstnm2 tstnm3 tstnm4 tstnm5
> tstnum2 -0.434
> tstnum3 -0.434  0.500
> tstnum4 -0.434  0.500  0.500
> tstnum5 -0.434  0.500  0.500  0.500
> tstnum6 -0.434  0.500  0.500  0.500  0.500
>
> So apparently the way I called lmer() places substantial constraints
> on the covariance structure.
That's the correlation matrix of the fixed-effects parameters. You
should have separately gotten estimates of the variance-covariance of
the random effects, which you coyly did not show us :-). Because you
are allowing only a simple, scalar random effect per student, there
will be an estimate of the variance of this random effect and an
estimate of the residual variance.
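A minimal sketch of pulling those two components out, using the
accessor names of current lme4 and the 'stdnt' grouping factor from the
call above; the scalar random effect forces a compound-symmetric
covariance on a student's six measurements:

    vc     <- as.data.frame(VarCorr(fit))
    sig.b2 <- vc$vcov[vc$grp == "stdnt"]      # between-student variance
    sig.e2 <- vc$vcov[vc$grp == "Residual"]   # residual variance
    ## implied covariance of the 6 tests: sig.b2 everywhere, plus sig.e2 on the diagonal
    Sigma.cs <- matrix(sig.b2, 6, 6) + diag(sig.e2, 6)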
> How can I (is there any way that I can)
> tell lmer() to fit the most general possible covariance structure?
It sounds like you want a model formula of
lmer(y ~ tstnum + (0 + tstnum|stdnt), data=schooldat)
but that model will have 21 variance-covariance terms to estimate (22
if you count the residual variance, but that one gets profiled out of
the optimization). I would not be surprised if the estimated
variance-covariance matrix for the random effects turns out to be
singular.
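If that larger model does converge, a minimal sketch of the check
against the unrestricted estimate (accessor names are those of current
lme4):

    fit.gen <- lmer(y ~ tstnum + (0 + tstnum | stdnt), data = schooldat)
    ## marginal covariance of a student's 6 tests: random-effect part plus residual
    Sig.hat <- VarCorr(fit.gen)$stdnt + diag(sigma(fit.gen)^2, 6)
    Sig.hat                                   # compare with S <- var(X)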
> As usual, advice, insight, tutelage humbly appreciated.
> If anyone wishes to experiment with the real data set (it's a bit
> too big to post here) I can make it available to them via email.
Generally I would jump at the chance but not with my "To Do" list in
its current, sadly over-committed, state.