[R] lmer4 and variable selection

Mon Aug 25 18:21:33 CEST 2008

Dear list, 

I am currently working with a rather large data set on body temperature
regulation in wintering birds. My original model contains quite a few
explanatory variables, but I do not (of course) wish to keep them all in my
final model. I've fitted the following model to the data:

> temp.lme1 <- lmer(T.B ~ tarsus + wing + weight + factor(age) + factor(sex) + fat +
+                     minsunset + day1oct + day1oct.2 + minnight + ave.day + minnight.1 +
+                     T.A + ave.night.1 + (1 | ID) + (1 | sign),
+                   data = bodytemp.df)

where T.B is body temperature. The explanatory variables are a number of
biometric measures (tarsus, wing, weight, fat, age, sex), various measures of
ambient temperature (ave.day, minnight.1, minnight, ave.night.1, T.A), and
time/date variables (minsunset, day1oct, day1oct.2). The random factors are ID
(each individual was sampled one to three times) and sign (the person
performing the measurements; 2 levels).
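
The grouping structure described above can be checked with a quick tabulation before fitting. This is only a minimal sketch in base R: toy.df and its values are hypothetical stand-ins for bodytemp.df, not the real measurements.

```r
# Hypothetical toy data standing in for bodytemp.df (not the real measurements)
toy.df <- data.frame(
  ID   = c("A", "A", "A", "B", "B", "C", "D", "D"),
  sign = c("p1", "p2", "p1", "p1", "p1", "p2", "p2", "p1")
)

# Number of measurements per individual
obs.per.id <- table(toy.df$ID)

# How many individuals were measured once, twice, or three times
table(obs.per.id)

# Number of levels of each grouping factor
n.id   <- length(unique(toy.df$ID))
n.sign <- length(unique(toy.df$sign))
```

Run on the real data, the same calls would show how many of the individuals contribute repeated measurements, and would confirm that sign has only two levels.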

Model output looks like this:

> summary(temp.lme1)
Linear mixed model fit by REML 
Formula: T.B ~ tarsus + wing + weight + factor(age) + factor(sex) + fat +
    minsunset + day1oct + day1oct.2 + minnight + ave.day + minnight.1 +
    T.A + ave.night.1 + (1 | ID) + (1 | sign)
   Data: bodytemp.df 
   AIC BIC logLik deviance REMLdev
 557.8 614 -260.9      441   521.8
Random effects:
 Groups   Name        Variance   Std.Dev.  
 ID       (Intercept) 1.0399e-01 0.32247096
 sign     (Intercept) 6.2663e-08 0.00025033
 Residual             8.0162e-01 0.89533134
Number of obs: 167, groups: ID, 124; sign, 2

Fixed effects:
                 Estimate Std. Error t value
(Intercept)     4.124e+01  4.104e+00  10.049
tarsus         -5.925e-02  5.801e-02  -1.021
wing           -6.252e-02  4.984e-02  -1.254
weight          1.499e-01  1.446e-01   1.037
factor(age)2K+  1.981e-01  1.651e-01   1.200
factor(sex)M    9.232e-02  2.146e-01   0.430
fat            -2.297e-02  8.150e-02  -0.282
minsunset      -1.104e-03  1.043e-03  -1.058
day1oct        -4.247e-03  2.879e-02  -0.148
day1oct.2       5.087e-05  1.560e-04   0.326
minnight       -5.987e-02  7.022e-02  -0.853
ave.day         1.128e-01  1.582e-01   0.713
minnight.1     -9.590e-02  1.684e-01  -0.570
T.A            -4.855e-02  5.185e-02  -0.936
ave.night.1     1.420e-01  2.477e-01   0.573

Correlation of Fixed Effects:
            (Intr) tarsus wing   weight f()2K+ fct()M fat    mnsnst day1ct dy1c.2 mnnght ave.dy mnng.1 T.A
tarsus      -0.851
wing        -0.870  0.966
weight       0.071 -0.417 -0.411
factr(g)2K+  0.211 -0.248 -0.241  0.219
factor(sx)M  0.573 -0.499 -0.526 -0.179  0.105
fat         -0.037  0.046  0.052 -0.264 -0.152  0.045
minsunset   -0.177 -0.144 -0.122  0.214 -0.101 -0.027 -0.045
day1oct     -0.261 -0.051 -0.052 -0.117 -0.145  0.140  0.131  0.515
day1oct.2    0.257  0.050  0.051  0.121  0.141 -0.149 -0.125 -0.484 -0.993
minnight    -0.074  0.249  0.216 -0.271 -0.032 -0.043  0.022  0.022 -0.168  0.231
ave.day     -0.025  0.070  0.050  0.001  0.045 -0.022  0.046 -0.363 -0.120  0.041 -0.415
minnight.1   0.304 -0.081 -0.045  0.069  0.129  0.012 -0.054 -0.349 -0.636  0.644  0.023  0.052
T.A          0.049 -0.043  0.018  0.130  0.040 -0.164 -0.065 -0.317 -0.288  0.249 -0.598  0.267  0.143
ave.night.1 -0.234  0.004 -0.015 -0.030 -0.110  0.016  0.031  0.493  0.614 -0.586  0.105 -0.524 -0.863 -0.243

At this point, I want to select the variables with the most explanatory
power to arrive at a final model. However, I'm not sure how to do
this because (not being a trained statistician) I'm used to having p-values
to guide me. Similarly, I would like to be able to report the relative
"importance" of variables in some way but, as is apparent from a number of
threads, p-values seem to be the least preferred option when it comes to
lmer. I've read about the mcmcsamp() function, but I'm not entirely sure
how to use it or how to interpret the output.
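
To make the question concrete, here is a minimal sketch of one possible approach: a likelihood-ratio comparison of two nested models fitted by ML rather than REML. It runs on simulated data, not on bodytemp.df. The data frame sim, its column values, and the effect sizes are all hypothetical, and the sketch assumes a current lme4 release in which lmer() accepts REML = FALSE. Whether such a test is trustworthy with so few repeated measurements per individual is part of what is being asked.

```r
library(lme4)  # assumes a current lme4 release is installed

set.seed(1)

# Simulated stand-in data: 60 hypothetical birds, 2 measurements each
n.id <- 60
sim <- data.frame(ID = factor(rep(seq_len(n.id), each = 2)))
sim$T.A <- rnorm(nrow(sim), mean = 0, sd = 5)   # ambient temperature
sim$fat <- rnorm(nrow(sim))                     # unrelated to the response here

# One random intercept per bird, expanded to one value per observation
bird.effect <- rnorm(n.id, sd = 0.5)[as.integer(sim$ID)]
sim$T.B <- 41 + 0.1 * sim$T.A + bird.effect + rnorm(nrow(sim), sd = 0.5)

# Fit by ML so that models differing in their fixed effects are comparable
m.full    <- lmer(T.B ~ T.A + fat + (1 | ID), data = sim, REML = FALSE)
m.reduced <- update(m.full, . ~ . - fat)

# Likelihood-ratio test for dropping 'fat' from the fixed effects
cmp <- anova(m.reduced, m.full)
print(cmp)
```

The table printed by anova() lists AIC, BIC and log-likelihood for both fits, together with a chi-square test of the term that was dropped.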

Any advice would be most appreciated.

Kind regards, 
Andreas Nord                   
