[R-sig-ME] glmer: order of variables drastically influences SE

Tue Aug 6 09:33:18 CEST 2019

Hi Ben,

Thanks for your answer and the link.

I posted the data on figshare and linked it to my comment on the GitHub
issue.

Don't hesitate if you need more information.

Sincerely,

Guillaume ADEUX

Le lun. 5 août 2019 à 21:26, Ben Bolker <bbolker using gmail.com> a écrit :

>
>   Similar problems have come up before:
>
> https://github.com/lme4/lme4/issues/449
>
>   It would be great to have data to look at (best of all would be
> posting it somewhere and linking to it as a comment in the issue/URL
> listed above ...
>
> On 2019-08-05 4:57 a.m., Guillaume Adeux wrote:
> > Hello everyone,
> >
> > Could anyone explain to me how the order of variables in the glmer
> function
> > could affect the standard error of the estimates? The difference is
> drastic
> > in my case (see the two model outputs at the bottom of the email)...
> > The data I used comes from a 3 way factorial experiment (tillage,
> > N=nitrogen, CC=cover crop species) on which an additional covariate was
> > measured (dry_bio_cover_m2).
> > My main objective is to identify the drivers of weed biomass in different
> > cover crop.
> > I came across the situation doing the following:
> >
> > *1. fitting full model*
> >
> mod_full_CC=glmer(dry_bio_weeds_m2+0.001~block+year+scale(dry_bio_cover_m2)*tillage*N*CC+(1|block:tillage)+(1|block:tillage:N)+(1|block:tillage:N:CC)+(1|block:year)+(1|block:year:tillage)+(1|block:year:tillage:N)+(1|block:year:tillage:N:CC),family=gaussian(link="sqrt"),control=glmerControl(optimizer="nloptwrap",optCtrl=list(algorithm="NLOPT_LN_NELDERMEAD")),data=biomassCC_wo_C)
> >
> > *2. identifying the most parcimonious model*
> > dred_CC=dredge(mod_full_CC,rank="AICc",fixed=c("block","year"))
> >
> > The most parcimonious model resulting from dredge (i.e.
> get.models(dred_CC,
> > 1)[[1]] ) was:
> > dry_bio_weeds_m2 + 0.001 ~ *CC + N + scale(dry_bio_cover_m2) +  tillage +
> > block + year *+
> >     (1 | block:tillage) + (1 | block:tillage:N) + (1 |
> block:tillage:N:CC)
> > + (1 | block:year) + (1 | block:year:tillage) + (1 |
> block:year:tillage:N)
> > + (1 | block:year:tillage:N:CC) +
> >    * CC:N + CC:scale(dry_bio_cover_m2) + CC:tillage +
> > scale(dry_bio_cover_m2):tillage*
> >
> > If I refit this model in a different (more logical) order with the same
> > variables, as followed,
> > dry_bio_weeds_m2 + 0.001 ~* block + year + scale(dry_bio_cover_m2) +
> > tillage + N + CC + CC:N + CC:tillage + CC:scale(dry_bio_cover_m2) +
> > tillage:scale(dry_bio_cover_m2)* +
> > (1 | block:tillage) + (1 | block:tillage:N) + (1 | block:tillage:N:CC) +
> (1
> > | block:year) + (1 | block:year:tillage) + (1 | block:year:tillage:N) +
> (1
> > |  block:year:tillage:N:CC)
> > I obtain similar estimates but drastically different SE, which of course
> > impacts post-hoc analysis.
> >
> >
> > The condition number of both models is also slightly different (12.1 for
> > the model with the variables in the order returned by dredge and 14.3 for
> > the model with the "logical" order), although no severe collinearity was
> > detected.
> >
> >
> >
> > Model with variables fitted in the order given by dredge:
> >
> > Generalized linear mixed model fit by maximum likelihood (Laplace
> > Approximation) ['glmerMod']
> >  Family: gaussian  ( sqrt )
> > Formula: dry_bio_weeds_m2 + 0.001 ~ CC + N + scale(dry_bio_cover_m2) +
> >      tillage + block + year + (1 | block:tillage) + (1 |
> > block:tillage:N) +
> >     (1 | block:tillage:N:CC) + (1 | block:year) + (1 |
> > block:year:tillage) +      (1 | block:year:tillage:N) + (1 |
> > block:year:tillage:N:CC) +
> >     CC:N + CC:scale(dry_bio_cover_m2) + CC:tillage +
> > scale(dry_bio_cover_m2):tillage
> >    Data: biomassCC_wo_C
> > Control: glmerControl(optimizer = "nloptwrap", optCtrl =
> > list(algorithm = "NLOPT_LN_NELDERMEAD"))
> >
> >      AIC      BIC   logLik deviance df.resid
> >   6294.5   6416.4  -3116.2   6232.5      347
> >
> > Scaled residuals:
> >     Min      1Q  Median      3Q     Max
> > -4.7837 -0.5006 -0.0011  0.5075  4.7593
> >
> > Random effects:
> >  Groups                  Name        Variance  Std.Dev.
> >  block:year:tillage:N:CC (Intercept) 3.061e+04 1.750e+02
> >  block:tillage:N:CC      (Intercept) 2.162e-05 4.650e-03
> >  block:year:tillage:N    (Intercept) 2.645e-05 5.143e-03
> >  block:tillage:N         (Intercept) 2.125e+01 4.610e+00
> >  block:year:tillage      (Intercept) 4.656e-03 6.823e-02
> >  block:year              (Intercept) 3.848e+02 1.962e+01
> >  block:tillage           (Intercept) 4.204e-04 2.050e-02
> >  Residual                            4.505e+03 6.712e+01
> > Number of obs: 378, groups:
> > block:year:tillage:N:CC, 192; block:tillage:N:CC, 96;
> > block:year:tillage:N, 64; block:tillage:N, 32; block:year:tillage, 16;
> > block:year, 8; block:tillage, 8
> >
> > Fixed effects:
> >                                    Estimate Std. Error   t value Pr(>|z|)
> > (Intercept)                       13.829033   0.001353 10219.154  <
> 2e-16 ***
> > CCVv                              -0.064097   0.001353   -47.367  <
> 2e-16 ***
> > CCTs                               0.059189   0.001353    43.739  <
> 2e-16 ***
> > N.L                                5.506568   0.001310  4202.646  <
> 2e-16 ***
> > N.Q                               -0.387064   0.001310  -295.407  <
> 2e-16 ***
> > N.C                               -0.123713   0.001310   -94.417  <
> 2e-16 ***
> > scale(dry_bio_cover_m2)           -3.096732   0.001309 -2366.243  <
> 2e-16 ***
> > tillageRT                         -0.034498   0.001310   -26.329  <
> 2e-16 ***
> > blockR2                           -1.369223   0.464429    -2.948
>  0.0032 **
> > blockR3                           -0.897212   0.001353  -663.006  <
> 2e-16 ***
> > blockR4                            0.536253   0.464366     1.155   0.2482
> > year2014                           0.484093   0.350696     1.380   0.1675
> > CCVv:N.L                          -3.795768   0.652249    -5.820
> 5.90e-09 ***
> > CCTs:N.L                          -3.760854   0.652139    -5.767
> 8.07e-09 ***
> > CCVv:N.Q                          -0.630917   0.652215    -0.967   0.3334
> > CCTs:N.Q                          -0.258428   0.652139    -0.396   0.6919
> > CCVv:N.C                           0.857098   0.652113     1.314   0.1887
> > CCTs:N.C                           0.273846   0.652088     0.420   0.6745
> > CCVv:scale(dry_bio_cover_m2)      -2.834524   0.001310 -2163.350  <
> 2e-16 ***
> > CCTs:scale(dry_bio_cover_m2)      -2.564852   0.001401 -1831.105  <
> 2e-16 ***
> > CCVv:tillageRT                     4.446872   0.001401  3174.709  <
> 2e-16 ***
> > CCTs:tillageRT                     2.758362   0.001401  1969.252  <
> 2e-16 ***
> > scale(dry_bio_cover_m2):tillageRT  2.022118   0.001310  1543.318  <
> 2e-16 ***
> >
> >
> >
> >
> >
> >
> >
> >
> > Model with variables fitted in "logical" order:
> >
> > Generalized linear mixed model fit by maximum likelihood (Laplace
> > Approximation) ['glmerMod']
> >  Family: gaussian  ( sqrt )
> > Formula: dry_bio_weeds_m2 + 0.001 ~ block + year +
> scale(dry_bio_cover_m2) +
> >     tillage + N + CC + CC:N + CC:tillage + CC:scale(dry_bio_cover_m2)
> > +      tillage:scale(dry_bio_cover_m2) + (1 | block:tillage) + (1 |
> >     block:tillage:N) + (1 | block:tillage:N:CC) + (1 | block:year) +
> >    (1 | block:year:tillage) + (1 | block:year:tillage:N) + (1 |
> >     block:year:tillage:N:CC)
> >    Data: biomassCC_wo_C
> > Control: glmerControl(optimizer = "nloptwrap", optCtrl =
> > list(algorithm = "NLOPT_LN_NELDERMEAD"))
> >
> >      AIC      BIC   logLik deviance df.resid
> >   6294.5   6416.4  -3116.2   6232.5      347
> >
> > Scaled residuals:
> >     Min      1Q  Median      3Q     Max
> > -4.7838 -0.5009 -0.0011  0.5075  4.7594
> >
> > Random effects:
> >  Groups                  Name        Variance  Std.Dev.
> >  block:year:tillage:N:CC (Intercept) 3.061e+04 1.749e+02
> >  block:tillage:N:CC      (Intercept) 2.141e-05 4.627e-03
> >  block:year:tillage:N    (Intercept) 2.605e-05 5.104e-03
> >  block:tillage:N         (Intercept) 2.106e+01 4.589e+00
> >  block:year:tillage      (Intercept) 4.487e-03 6.699e-02
> >  block:year              (Intercept) 3.822e+02 1.955e+01
> >  block:tillage           (Intercept) 4.228e-04 2.056e-02
> >  Residual                            4.504e+03 6.711e+01
> > Number of obs: 378, groups:
> > block:year:tillage:N:CC, 192; block:tillage:N:CC, 96;
> > block:year:tillage:N, 64; block:tillage:N, 32; block:year:tillage, 16;
> > block:year, 8; block:tillage, 8
> >
> > Fixed effects:
> >                                   Estimate Std. Error t value Pr(>|z|)
> > (Intercept)                       13.82995   42.86352   0.323  0.74696
> > blockR2                           -1.36672   40.78008  -0.034  0.97326
> > blockR3                           -0.89843   40.78048  -0.022  0.98242
> > blockR4                            0.53355   40.77879   0.013  0.98956
> > year2014                           0.48290   28.79334   0.017  0.98662
> > scale(dry_bio_cover_m2)           -3.09673    0.81313  -3.808  0.00014
> ***
> > tillageRT                         -0.03537   43.77607  -0.001  0.99936
> > N.L                                5.50829   43.77552   0.126  0.89987
> > N.Q                               -0.38899   43.77300  -0.009  0.99291
> > N.C                               -0.12513   43.77337  -0.003  0.99772
> > CCVv                              -0.06447   43.74925  -0.001  0.99882
> > CCTs                               0.05444   43.75387   0.001  0.99901
> > N.L:CCVv                          -3.79754   61.86354  -0.061  0.95105
> > N.Q:CCVv                          -0.62820   61.86055  -0.010  0.99190
> > N.C:CCVv                           0.86704   61.85967   0.014  0.98882
> > N.L:CCTs                          -3.76680   61.86056  -0.061  0.95145
> > N.Q:CCTs                          -0.25653   61.85857  -0.004  0.99669
> > N.C:CCTs                           0.27781   61.85861   0.004  0.99642
> > tillageRT:CCVv                     4.44671   61.86316   0.072  0.94270
> > tillageRT:CCTs                     2.75968   61.86792   0.045  0.96442
> > scale(dry_bio_cover_m2):CCVv      -2.83780    0.93223  -3.044  0.00233 **
> > scale(dry_bio_cover_m2):CCTs      -2.56887    1.32152  -1.944  0.05191 .
> > scale(dry_bio_cover_m2):tillageRT  2.02139    0.85983   2.351  0.01873 *
> >
> >
> > I can share the data if need be. Thank you very much for your help.
> >
> > Sincerely,
> >
> > Guillaume ADEUX
> >
> >       [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-mixed-models using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
>
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

	[[alternative HTML version deleted]]