[R] lmer4 and variable selection
Andreas Nord
andreas.nord at zooekol.lu.se
Mon Aug 25 18:21:33 CEST 2008
Dear list,
I am currently working with a rather large data set on body temperature
regulation in wintering birds. My original model contains quite a few
explanatory variables, but I do not (of course) wish to keep them all in my
final model. I've fitted the following model to the data:
> temp.lme1 <- lmer(T.B ~ tarsus + wing + weight + factor(age) + factor(sex) +
+                   fat + minsunset + day1oct + day1oct.2 + minnight + ave.day +
+                   minnight.1 + T.A + ave.night.1 + (1 | ID) + (1 | sign),
+                   data = bodytemp.df)
where T.B is body temperature; the explanatory variables are a number of
biometric measures (tarsus, wing, weight, fat, age, sex) and various measures
of ambient temperature (ave.day, minnight.1, minnight, ave.night.1, T.A) and
time/date (minsunset, day1oct, day1oct.2). The random factors are ID (each
individual was sampled one to three times) and sign (the person performing
the measurements; 2 levels).
Model output looks like this:
> summary(temp.lme1)
Linear mixed model fit by REML
Formula: T.B ~ tarsus + wing + weight + factor(age) + factor(sex) + fat +
minsunset + day1oct + day1oct.2 + minnight + ave.day + minnight.1 + T.A
+ ave.night.1 + (1 | ID) + (1 | sign)
Data: bodytemp.df
   AIC   BIC logLik deviance REMLdev
 557.8   614 -260.9      441   521.8
Random effects:
 Groups   Name        Variance   Std.Dev.
 ID       (Intercept) 1.0399e-01 0.32247096
 sign     (Intercept) 6.2663e-08 0.00025033
 Residual             8.0162e-01 0.89533134
Number of obs: 167, groups: ID, 124; sign, 2
Fixed effects:
                 Estimate Std. Error t value
(Intercept)     4.124e+01  4.104e+00  10.049
tarsus         -5.925e-02  5.801e-02  -1.021
wing           -6.252e-02  4.984e-02  -1.254
weight          1.499e-01  1.446e-01   1.037
factor(age)2K+  1.981e-01  1.651e-01   1.200
factor(sex)M    9.232e-02  2.146e-01   0.430
fat            -2.297e-02  8.150e-02  -0.282
minsunset      -1.104e-03  1.043e-03  -1.058
day1oct        -4.247e-03  2.879e-02  -0.148
day1oct.2       5.087e-05  1.560e-04   0.326
minnight       -5.987e-02  7.022e-02  -0.853
ave.day         1.128e-01  1.582e-01   0.713
minnight.1     -9.590e-02  1.684e-01  -0.570
T.A            -4.855e-02  5.185e-02  -0.936
ave.night.1     1.420e-01  2.477e-01   0.573
Correlation of Fixed Effects:
             (Intr) tarsus   wing weight f()2K+ fct()M    fat mnsnst day1ct dy1c.2 mnnght ave.dy mnng.1    T.A
tarsus       -0.851
wing         -0.870  0.966
weight        0.071 -0.417 -0.411
factr(g)2K+   0.211 -0.248 -0.241  0.219
factor(sx)M   0.573 -0.499 -0.526 -0.179  0.105
fat          -0.037  0.046  0.052 -0.264 -0.152  0.045
minsunset    -0.177 -0.144 -0.122  0.214 -0.101 -0.027 -0.045
day1oct      -0.261 -0.051 -0.052 -0.117 -0.145  0.140  0.131  0.515
day1oct.2     0.257  0.050  0.051  0.121  0.141 -0.149 -0.125 -0.484 -0.993
minnight     -0.074  0.249  0.216 -0.271 -0.032 -0.043  0.022  0.022 -0.168  0.231
ave.day      -0.025  0.070  0.050  0.001  0.045 -0.022  0.046 -0.363 -0.120  0.041 -0.415
minnight.1    0.304 -0.081 -0.045  0.069  0.129  0.012 -0.054 -0.349 -0.636  0.644  0.023  0.052
T.A           0.049 -0.043  0.018  0.130  0.040 -0.164 -0.065 -0.317 -0.288  0.249 -0.598  0.267  0.143
ave.night.1  -0.234  0.004 -0.015 -0.030 -0.110  0.016  0.031  0.493  0.614 -0.586  0.105 -0.524 -0.863 -0.243
At this point, I want to select the variables with the most explanatory power
to arrive at a final model. However, I'm not sure how to do this because (not
being a trained statistician) I'm used to having p-values to guide me.
Similarly, I would like to be able to report the relative "importance" of
variables in some way but, as is apparent from a number of threads, p-values
seem to be the least preferred option when it comes to lmer. I've read about
the mcmcsamp() function, but I'm not entirely sure how to use it or how to
interpret the output.
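To make the question concrete: one commonly suggested alternative to
per-coefficient p-values is to compare nested models fitted by maximum
likelihood (REML = FALSE) with a likelihood ratio test. The following is only a
minimal sketch of that idea, not a recommended final procedure. The data frame
sim.df and every number used to generate it are made up for illustration
(loosely echoing the output above); it is not the real bodytemp.df, and only
two of the explanatory variables are included.

```r
library(lme4)

set.seed(1)

# Hypothetical stand-in for bodytemp.df: 124 birds, each measured 1 to 3 times.
n_id  <- 124
n_rep <- sample(1:3, n_id, replace = TRUE)
id    <- rep(seq_len(n_id), times = n_rep)
n     <- length(id)

sim.df <- data.frame(
  ID     = factor(id),
  weight = rnorm(n, mean = 18, sd = 1),   # invented body weights
  T.A    = rnorm(n, mean = 0,  sd = 5)    # invented ambient temperatures
)

# Simulated response: fixed effects + per-bird random intercept + noise.
id_eff <- rnorm(n_id, mean = 0, sd = 0.3)
sim.df$T.B <- 41 + 0.15 * sim.df$weight - 0.05 * sim.df$T.A +
  id_eff[id] + rnorm(n, mean = 0, sd = 0.9)

# Fit both models by ML, not REML, so that models with different
# fixed effects can be compared by their likelihoods.
full    <- lmer(T.B ~ weight + T.A + (1 | ID), data = sim.df, REML = FALSE)
reduced <- lmer(T.B ~ weight + (1 | ID),       data = sim.df, REML = FALSE)

# Likelihood ratio test for dropping T.A.
lrt <- anova(reduced, full)
print(lrt)

# The same kind of test for each fixed-effect term in turn.
print(drop1(full, test = "Chisq"))
```

With many candidate predictors, as in the model above, repeated testing of
this kind has its own well-known problems, so the sketch is meant to show the
mechanics only.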
Any advice would be most appreciated.
Kind regards,
Andreas Nord
--
View this message in context: http://www.nabble.com/lmer4-and-variable-selection-tp19146850p19146850.html
Sent from the R help mailing list archive at Nabble.com.