[R] How do you report lmer results?
Luke Duncan
luke.mangaliso.duncan at gmail.com
Tue Jul 26 17:06:13 CEST 2011
Dear R-Gurus
I am a PhD student from South Africa working on chimpanzee behaviour.
I am looking at patterns of shade utilization and am using generalized
linear mixed models to examine the effects of various factors on
whether chimpanzees choose to spend time in the sun or shade. I
realise that the lme4 package and the outputs of the lmer functions
have been discussed ad nauseum but I have been reading through many of
them and am finding it all extremely confusing. I have used programs
like Statistica to run glm's with no random factors but now that I
have to include random effects, this is no longer an option. Thus I
have turned to R (and hence I am a complete R virgin).
What I would like to know is the following. What is the accepted
general consensus on how to report the outputs of a lmer model? What
is the currently accepted method for determining whether fixed effect
parameters are significant in predicting the outcomes of the model
(LHR, AIC, Wald X^2...?)? While I recognise that the "Pr(>|z|)" value
is not a definitive p-value (rather an approximation), can one treat
it loosely as an 'estimated' p-value?
My model comprises 2 categorical predictor variables (Time of day:
'Time'; Available amount of shade, coded as a three-way
classification: 'Tertile'), two continuous predictor variables
(maximum temperature: 'Max'; minimum temperature: 'Min') and three
random effects (Which experimental dataset the data were derived from:
'Exp'; Which individual chimpanzee was observed: 'Indiv'; Which
area/zone of the enclosure they occupied at the time of observation:
'Zone'). These are the outputs that I have generated thus far using
LHR testing. How should I be interpretting and reporting these
outputs?
Generalized linear mixed model fit by the Laplace approximation
Formula: prop ~ Time + Tertile + (1 | Exp) + (1 | Indiv) + (1 | Zone)
+ Max + Min
Data: sdata
AIC BIC logLik deviance
215.5 259 -95.77 191.5
Random effects:
Groups Name Variance Std.Dev.
Zone (Intercept) 2.6596e-01 5.1571e-01
Indiv (Intercept) 0.0000e+00 0.0000e+00
Exp (Intercept) 2.9021e-11 5.3871e-06
Number of obs: 276, groups: Zone, 8; Indiv, 7; Exp, 2
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.15725 1.58304 -1.363 0.17297
Time11h00 0.96362 0.40956 2.353 0.01863 *
Time12h00 1.57906 0.49033 3.220 0.00128 **
Time13h00 1.58951 0.40705 3.905 9.43e-05 ***
Time14h00 1.07939 0.53876 2.003 0.04513 *
TertileLOW -1.40906 0.53761 -2.621 0.00877 **
TertileMEDIUM -1.24862 0.57396 -2.175 0.02960 *
Max 0.10122 0.08611 1.175 0.23985
Min 0.13439 0.10292 1.306 0.19162
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) Tm1100 Tm1200 Tm1300 Tm1400 TrtLOW TMEDIU Max
Time11h00 0.056
Time12h00 0.258 0.447
Time13h00 0.115 0.510 0.486
Time14h00 -0.049 0.318 0.276 0.370
TertileLOW -0.146 -0.119 -0.215 -0.236 -0.096
TertlMEDIUM -0.128 0.024 -0.145 -0.155 -0.224 0.707
Max -0.914 -0.162 -0.277 -0.198 -0.084 -0.025 -0.022
Min 0.178 0.074 -0.023 0.105 0.244 -0.101 -0.077 -0.463
> anova(m1,m2)
Data: sdata
Models:
m2: prop ~ Time + (1 | Exp) + (1 | Indiv) + (1 | Zone) + Max + Min
m1: prop ~ Time + Tertile + (1 | Exp) + (1 | Indiv) + (1 | Zone) +
m1: Max + Min
Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
m2 10 216.72 252.92 -98.359
m1 12 215.55 258.99 -95.773 5.1721 2 0.07532 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> anova(m1,m3)
Data: sdata
Models:
m3: prop ~ Tertile + (1 | Exp) + (1 | Indiv) + (1 | Zone) + Max +
m3: Min
m1: prop ~ Time + Tertile + (1 | Exp) + (1 | Indiv) + (1 | Zone) +
m1: Max + Min
Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
m3 8 226.11 255.08 -105.057
m1 12 215.55 258.99 -95.773 18.567 4 0.0009556 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> anova(m1,m4)
Data: sdata
Models:
m4: prop ~ Time + Tertile + (1 | Exp) + (1 | Indiv) + (1 | Zone) +
m4: Min
m1: prop ~ Time + Tertile + (1 | Exp) + (1 | Indiv) + (1 | Zone) +
m1: Max + Min
Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
m4 11 214.81 254.64 -96.407
m1 12 215.55 258.99 -95.773 1.2672 1 0.2603
> anova(m1,m5)
Data: sdata
Models:
m5: prop ~ Time + Tertile + (1 | Exp) + (1 | Indiv) + (1 | Zone) +
m5: Max
m1: prop ~ Time + Tertile + (1 | Exp) + (1 | Indiv) + (1 | Zone) +
m1: Max + Min
Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
m5 11 215.22 255.05 -96.613
m1 12 215.55 258.99 -95.773 1.6792 1 0.195
As I understand this output, the only significant predictor in the
model appears to be time of day. But, I don't really know how this
should be reported. Can you point me to some papers or examples where
lmer outputs have been reported formally? Any help that you could
offer would be MOST appreciated.
Sincerely (in desperation)
Luke Duncan
PhD Candidate
School of Animal, Plant and Environmental Sciences
University of the Witwatersrand
Johannesburg, South Africa
+27 11 717 6452
More information about the R-help
mailing list