[R-sig-ME] questions about parsimonious mixed models

Wed Oct 26 19:03:45 CEST 2016

In my opinion, LRTs are only meaningful for the comparison of
non-degenerate models. Thus, if extending the reduced model with
correlation parameters leads to overparameterization, I would stay with the
reduced model.

This is a quick response. I did not look at your example in detail.

On 26 Oct 2016, at 16:13, Wing Yee Chow <wingyeechow.zoey at gmail.com> wrote:

Hi there,

I tried to follow the suggestions in Bates et al.'s parsimonious mixed model
paper (https://arxiv.org/abs/1506.04967) and I'm having some
problems. Basically, I started with a maximal model (m0) and then a
zero-correlation-parameter model (m1), then I reduced the model by taking
out near-zero variance components (m2). Once I extended the reduced model
with correlation parameters (m3), the model became degenerate (some
correlation parameters are 1 or -1). However, the likelihood ratio test
(anova()) suggests that m3 provides a significantly better fit than m2. How
should I proceed? What would be the optimal model in cases like this?

I'm not sure if I did something wrong in the process, so I'll detail what I
did below.

Thanks!

Wing-Yee Chow

==================================

My experiment has a 2x2 within-participant design with 24 participants and
48 items. Both factors (Congruity and SentenceType) are categorical and I
specified the contrasts this way:

contrasts(tempdata$congruity) <- contr.sum(2)/2
contrasts(tempdata$sentencetype) <- contr.sum(2)/2
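
As a quick illustration of what this coding implies, here is a minimal base-R sketch on made-up numbers (`f` and `y` are hypothetical and not part of tempdata). Dividing contr.sum(2) by 2 codes the two levels as +0.5 and -0.5, so in a simple linear model the slope is the difference between the two level means and the intercept is their average.

```r
# contr.sum(2) codes the two levels as +1 / -1; halving gives +0.5 / -0.5
cm <- contr.sum(2) / 2
print(cm)

# Hypothetical two-level factor and response, for illustration only
f <- factor(c("a", "b", "a", "b"))
contrasts(f) <- cm
y <- c(10, 4, 12, 6)   # mean of "a" is 11, mean of "b" is 5

# Intercept = average of the two level means, slope = mean(a) - mean(b)
fit <- lm(y ~ f)
print(coef(fit))
```

With this coding the fixed-effect estimates in the summaries below can be read directly as condition differences on the scale of `value`.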

I started with a maximal model, and rePCA() suggests that it's
over-parameterised:

summary(m0 <- lmer(value ~ sentencetype * congruity +
                     (sentencetype * congruity | subj) +
                     (sentencetype * congruity | item),
                   REML = FALSE, data = tempdata))

Linear mixed model fit by maximum likelihood  ['lmerMod']

Formula: value ~ sentencetype * congruity + (sentencetype * congruity |
subj) + (sentencetype * congruity | item)

  Data: tempdata

    AIC      BIC   logLik deviance df.resid

14415.9  14540.2  -7182.9  14365.9     1045

Scaled residuals:

   Min      1Q  Median      3Q     Max

-2.6228 -0.6275 -0.1785  0.4124  6.1246

Random effects:

Groups   Name                     Variance Std.Dev. Corr

item     (Intercept)               5412.2   73.568

         sentencetype1             3316.3   57.588   0.92

         congruity1                 866.0   29.429   0.58  0.85

         sentencetype1:congruity1   138.1   11.752   0.51  0.15 -0.40

subj     (Intercept)              10789.8  103.874

         sentencetype1               56.2    7.497   1.00

         congruity1                 835.1   28.898  -1.00 -1.00

         sentencetype1:congruity1  2279.8   47.747   1.00  1.00 -1.00

Residual                          34511.5  185.773

Number of obs: 1070, groups:  item, 48; subj, 24

Fixed effects:

                        Estimate Std. Error t value

(Intercept)                388.84      24.39  15.944

sentencetype1               56.27      14.18   3.968

congruity1                 -18.19      13.51  -1.346

sentencetype1:congruity1   -20.90      24.82  -0.842

Correlation of Fixed Effects:

           (Intr) sntnc1 cngrt1

sentenctyp1  0.328

congruity1  -0.300  0.109

sntnctyp1:1  0.357  0.047 -0.186

summary(rePCA(m0))

$item

Importance of components:

                        [,1]    [,2]      [,3] [,4]

Standard deviation     0.5070 0.15787 1.228e-06    0

Proportion of Variance 0.9116 0.08838 0.000e+00    0

Cumulative Proportion  0.9116 1.00000 1.000e+00    1

$subj

Importance of components:

                       [,1]      [,2]      [,3] [,4]

Standard deviation     0.636 1.045e-06 1.895e-07    0

Proportion of Variance 1.000 0.000e+00 0.000e+00    0

Cumulative Proportion  1.000 1.000e+00 1.000e+00    1

Then, I converted the factors to numeric covariates using model.matrix() and
fitted the zero-correlation-parameter model:

summary(m1 <- lmer(value ~ sentencetype * congruity +
                     (cSenttype * cCongruity || subj) +
                     (cSenttype * cCongruity || item),
                   REML = FALSE, data = tempdata))

Linear mixed model fit by maximum likelihood  ['lmerMod']

Formula: value ~ sentencetype * congruity + ((1 | subj) + (0 + cSenttype |

   subj) + (0 + cCongruity | subj) + (0 + cSenttype:cCongruity |
subj)) + ((1 | item) + (0 + cSenttype | item) + (0 + cCongruity |

   item) + (0 + cSenttype:cCongruity | item))

  Data: tempdata

    AIC      BIC   logLik deviance df.resid

14419.9  14484.5  -7196.9  14393.9     1057

Scaled residuals:

   Min      1Q  Median      3Q     Max

-2.4709 -0.6110 -0.1823  0.4011  6.1156

Random effects:

Groups   Name                 Variance  Std.Dev.

item     cSenttype:cCongruity 0.000e+00 0.000e+00

item.1   cCongruity           0.000e+00 0.000e+00

item.2   cSenttype            2.478e+03 4.978e+01

item.3   (Intercept)          5.268e+03 7.258e+01

subj     cSenttype:cCongruity 0.000e+00 0.000e+00

subj.1   cCongruity           3.473e+01 5.893e+00

subj.2   cSenttype            3.819e-11 6.180e-06

subj.3   (Intercept)          1.079e+04 1.039e+02

Residual                      3.543e+04 1.882e+02

Number of obs: 1070, groups:  item, 48; subj, 24

Fixed effects:

                        Estimate Std. Error t value

(Intercept)                388.41      24.34  15.957

sentencetype1               57.23      13.59   4.210

congruity1                 -17.72      11.60  -1.527

sentencetype1:congruity1   -20.81      23.07  -0.902

Correlation of Fixed Effects:

           (Intr) sntnc1 cngrt1

sentenctyp1 -0.002

congruity1   0.000 -0.001

sntnctyp1:1  0.000 -0.002 -0.009
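
The conversion to numeric covariates is described but not shown in the post. Below is a minimal sketch of one way it could be done, on a toy data frame (`toy` and its level labels are hypothetical; only the column names cSenttype and cCongruity come from the models above). The usual motivation for this step is that the `||` syntax in lmer() is documented to drop correlations only between numeric terms, so factors are replaced by their contrast columns first.

```r
# Toy stand-in for tempdata: one row per design cell (hypothetical labels)
toy <- expand.grid(sentencetype = factor(c("a", "b")),
                   congruity    = factor(c("con", "incon")))
contrasts(toy$sentencetype) <- contr.sum(2) / 2
contrasts(toy$congruity)    <- contr.sum(2) / 2

# model.matrix() expands the factors into their numeric contrast columns
mm <- model.matrix(~ sentencetype * congruity, data = toy)
print(colnames(mm))

# Pull out the main-effect columns as plain numeric covariates
toy$cSenttype  <- as.numeric(mm[, "sentencetype1"])
toy$cCongruity <- as.numeric(mm[, "congruity1"])
print(toy)
```

The product cSenttype * cCongruity then reproduces the interaction column of the model matrix, which is what the random-effects terms in m1 rely on.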

Once again rePCA() suggests that the model is over-parameterised:

summary(rePCA(m1))

$item

Importance of components:

                        [,1]   [,2] [,3] [,4]

Standard deviation     0.3856 0.2645    0    0

Proportion of Variance 0.6801 0.3200    0    0

Cumulative Proportion  0.6801 1.0000    1    1

$subj

Importance of components:

                        [,1]    [,2]      [,3] [,4]

Standard deviation     0.5518 0.03131 3.283e-08    0

Proportion of Variance 0.9968 0.00321 0.000e+00    0

Cumulative Proportion  0.9968 1.00000 1.000e+00    1

So then I reduced the model by taking out the four variance components that are
zero or near zero, and rePCA() suggests that this reduced model is no longer
over-parameterised:

summary(m2 <- lmer(value ~ sentencetype * congruity +
                     (cCongruity || subj) +
                     (cSenttype || item),
                   REML = FALSE, data = tempdata))

Linear mixed model fit by maximum likelihood  ['lmerMod']

Formula: value ~ sentencetype * congruity + ((1 | subj) + (0 + cCongruity |
subj)) + ((1 | item) + (0 + cSenttype | item))

  Data: tempdata

    AIC      BIC   logLik deviance df.resid

14411.9  14456.6  -7196.9  14393.9     1061

Scaled residuals:

   Min      1Q  Median      3Q     Max

-2.4709 -0.6110 -0.1823  0.4011  6.1156

Random effects:

Groups   Name        Variance Std.Dev.

item     cSenttype    2478.42  49.784

item.1   (Intercept)  5267.94  72.581

subj     cCongruity     34.73   5.894

subj.1   (Intercept) 10785.48 103.853

Residual             35428.11 188.224

Number of obs: 1070, groups:  item, 48; subj, 24

Fixed effects:

                        Estimate Std. Error t value

(Intercept)                388.41      24.34  15.957

sentencetype1               57.23      13.59   4.210

congruity1                 -17.72      11.60  -1.527

sentencetype1:congruity1   -20.81      23.07  -0.902

Correlation of Fixed Effects:

           (Intr) sntnc1 cngrt1

sentenctyp1 -0.002

congruity1   0.000 -0.001

sntnctyp1:1  0.000 -0.002 -0.009

summary(rePCA(m2))

$item

Importance of components:

                        [,1]   [,2]

Standard deviation     0.3856 0.2645

Proportion of Variance 0.6801 0.3200

Cumulative Proportion  0.6801 1.0000

$subj

Importance of components:

                        [,1]    [,2]

Standard deviation     0.5518 0.03131

Proportion of Variance 0.9968 0.00321

Cumulative Proportion  0.9968 1.00000
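
To make the rePCA() numbers for m2 less opaque, here is a small base-R sketch that recomputes them by hand from the summary(m2) output above. It assumes that, for a zero-correlation model, the rePCA() standard deviations are simply the random-effect standard deviations divided by the residual standard deviation; the helper names `rel_sdev` and `prop_var` are made up for this illustration.

```r
# Values copied from summary(m2) above
sigma_resid <- 188.224
sd_item <- c(intercept = 72.581,  cSenttype  = 49.784)
sd_subj <- c(intercept = 103.853, cCongruity = 5.894)

# Hypothetical helpers: relative standard deviations and variance shares
rel_sdev <- function(sds, sigma) sort(sds / sigma, decreasing = TRUE)
prop_var <- function(sdev) sdev^2 / sum(sdev^2)

item_sdev <- rel_sdev(sd_item, sigma_resid)
subj_sdev <- rel_sdev(sd_subj, sigma_resid)

print(round(item_sdev, 4))            # compare with $item above
print(round(prop_var(item_sdev), 4))
print(round(subj_sdev, 4))            # compare with $subj above
print(round(prop_var(subj_sdev), 5))
```

Under that assumption the second subj component (the cCongruity slope) accounts for roughly 0.3% of the subject-level variance, so it is small but not numerically zero.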

Then I followed Bates et al.'s guidelines in extending this reduced model
with correlation parameters:

summary(m3 <- lmer(value ~ sentencetype * congruity +
                     (cCongruity | subj) +
                     (cSenttype | item),
                   REML = FALSE, data = tempdata))

Linear mixed model fit by maximum likelihood  ['lmerMod']

Formula: value ~ sentencetype * congruity + (cCongruity | subj) + (cSenttype
|      item)

  Data: tempdata

    AIC      BIC   logLik deviance df.resid

14395.2  14450.0  -7186.6  14373.2     1059

Scaled residuals:

   Min      1Q  Median      3Q     Max

-2.6509 -0.6383 -0.1679  0.4151  5.9618

Random effects:

Groups   Name        Variance Std.Dev. Corr

item     (Intercept)  5391.4   73.43

         cSenttype    3085.8   55.55   1.00

subj     (Intercept) 10803.3  103.94

         cCongruity    826.1   28.74   -1.00

Residual             35022.7  187.14

Number of obs: 1070, groups:  item, 48; subj, 24

Fixed effects:

                        Estimate Std. Error t value

(Intercept)                388.76      24.40  15.933

sentencetype1               56.88      13.99   4.066

congruity1                 -18.10      12.88  -1.405

sentencetype1:congruity1   -20.94      22.93  -0.913

Correlation of Fixed Effects:

           (Intr) sntnc1 cngrt1

sentenctyp1  0.247

congruity1  -0.396 -0.001

sntnctyp1:1  0.000 -0.002 -0.007

The correlation parameters of 1 and -1 suggest that this model is
degenerate. This is consistent with the results of rePCA():

summary(rePCA(m3))

$item

Importance of components:

                       [,1] [,2]

Standard deviation     0.492    0

Proportion of Variance 1.000    0

Cumulative Proportion  1.000    1

$subj

Importance of components:

                        [,1]      [,2]

Standard deviation     0.5762 1.678e-07

Proportion of Variance 1.0000 0.000e+00

Cumulative Proportion  1.0000 1.000e+00
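
The link between correlations of 1 or -1 and the zero component in rePCA() can be checked by hand. The sketch below is base R only and rests on two assumptions: the printed correlations are exactly 1.00 and -1.00 (they may be rounded), and the rePCA() standard deviations equal the square roots of the eigenvalues of the random-effects covariance matrix scaled by the residual variance. The helper `rel_sdev` is hypothetical.

```r
# Values copied from summary(m3) above
sigma_resid <- 187.14

# Hypothetical helper: relative standard deviations of the principal
# components for a 2 x 2 random-effects covariance matrix
rel_sdev <- function(sds, rho, sigma) {
  R     <- matrix(c(1, rho, rho, 1), nrow = 2)   # correlation matrix
  Sigma <- diag(sds) %*% R %*% diag(sds)         # covariance matrix
  ev    <- eigen(Sigma / sigma^2, symmetric = TRUE)$values
  sqrt(pmax(ev, 0))                              # guard against tiny negatives
}

item_sdev <- rel_sdev(c(73.43, 55.55),   1.00, sigma_resid)
subj_sdev <- rel_sdev(c(103.94, 28.74), -1.00, sigma_resid)

print(round(item_sdev, 4))   # compare with $item above
print(round(subj_sdev, 4))   # compare with $subj above
```

With a correlation of exactly 1 or -1 the covariance matrix has rank one, so the second component carries no variance: the intercept and slope are estimated as a single dimension of variation.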

However, the likelihood ratio test suggests that m3 provides a
significantly better fit than m2:

anova(m2, m3)

Data: tempdata
Models:
m2: value ~ sentencetype * congruity + ((1 | subj) + (0 + cCongruity |
m2:     subj)) + ((1 | item) + (0 + cSenttype | item))
m3: value ~ sentencetype * congruity + (cCongruity | subj) + (cSenttype |
m3:     item)
   Df   AIC   BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
m2  9 14412 14457 -7196.9    14394
m3 11 14395 14450 -7186.6    14373 20.639      2  3.298e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
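
For reference, the numbers in this table can be reproduced from the reported deviances and parameter counts. This is a minimal base-R sketch using only values printed above; the variable names are made up for the illustration.

```r
# Deviances and parameter counts as printed in the summaries and anova table
dev_m2 <- 14393.9; df_m2 <- 9
dev_m3 <- 14373.2; df_m3 <- 11

# LRT statistic from the rounded deviances (anova() reports 20.639
# because it works with the unrounded values)
chisq_rounded <- dev_m2 - dev_m3
print(chisq_rounded)

# p-value under a chi-square reference distribution with 2 df
p_value <- pchisq(20.639, df = df_m3 - df_m2, lower.tail = FALSE)
print(p_value)

# AIC = deviance + 2 * number of parameters
aic <- c(m2 = dev_m2 + 2 * df_m2, m3 = dev_m3 + 2 * df_m3)
print(aic)
```

The chi-square reference distribution assumes that the additional parameters are not estimated on the boundary of the parameter space. With both correlations in m3 at 1 or -1 that assumption is doubtful, so this p-value should probably not be taken at face value, which is in line with the reply at the top of this thread.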

Questions: Should I have reduced the model further before extending it with
correlation parameters? How should I proceed (or what should I have done
differently)?


_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
