[R] lm coefficients output confusing

Ross Culloch ross.culloch at dur.ac.uk
Thu Aug 13 22:45:31 CEST 2009

Hi all,

I have a question about how the lm() function lists its coefficients. My
data are below, showing a list of hours (HR) relating to the time spent
resting (R) by an individual animal. I simply want to fit an lm() and pass
it to anova() to see whether there is a significant difference in resting
between hours.

   HR         R
1   2 0.6666667
2   2 0.4666667
3   2 0.8000000
4   2 0.6333333
5   2 0.7333333
6   2 0.8000000
7   2 0.8666667
8   2 0.7857143
9   2 0.7826087
10  2 0.6666667
11  2 0.9166667
12  2 0.6666667
13  3 0.5294118
14  3 0.8541667
15  3 0.4583333
16  3 0.5882353
17  3 0.9347826
18  3 0.7878788
19  3 0.7857143
20  3 0.6944444
21  3 0.8333333
22  3 0.7450980
23  3 0.9230769
24  3 0.7222222
25  4 0.6571429
26  4 0.7241379
27  4 0.7391304
28  4 0.6571429
29  4 0.8000000
30  4 0.9130435
31  4 0.7187500
32  4 0.8437500
33  4 0.9230769
34  4 0.8571429
35  4 0.8695652
36  4 0.8888889
37  5 0.3333333
38  5 0.5365854
39  5 0.6774194
40  5 0.7142857
41  5 0.6904762
42  5 0.5483871
43  5 0.5952381
44  5 0.4166667
45  5 0.5666667
46  5 0.5952381
47  5 0.7894737
48  5 0.7500000
49  6 0.6268657
50  6 0.7187500
51  6 0.5500000
52  6 0.7164179
53  6 0.7656250
54  6 0.5869565
55  6 0.7164179
56  6 0.7031250
57  6 0.7230769
58  6 0.7462687
59  6 0.9200000
60  6 0.8536585
61  7 0.6379310
62  7 0.5357143
63  7 0.5227273
64  7 0.8000000
65  7 0.6724138
66  7 0.7083333
67  7 0.7241379
68  7 0.6938776
69  7 0.6545455
70  7 0.7931034
71  7 0.7560976
72  7 0.8684211
73  8 0.6727273
74  8 0.6000000
75  8 0.8333333
76  8 0.8181818
77  8 0.7818182
78  8 0.7647059
79  8 0.5818182
80  8 0.5918367
81  8 0.7450980
82  8 0.7818182
83  8 0.8048780
84  8 0.8684211
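
For anyone wanting to reproduce this, here is a minimal sketch of how the storage type of HR changes the model. The post does not show how HR is stored in rdata2, so the use of factor() here is an assumption, as are the name rsub and the choice of rows (the first four rows each for hours 2, 3 and 4 from the table above). The point is only that a numeric HR uses 1 degree of freedom, while a factor HR uses one fewer than its number of levels.

```r
# Assumption: a small subset of the table above (first four rows of hours 2-4).
rsub <- data.frame(
  HR = rep(c(2, 3, 4), each = 4),
  R  = c(0.6666667, 0.4666667, 0.8000000, 0.6333333,
         0.5294118, 0.8541667, 0.4583333, 0.5882353,
         0.6571429, 0.7241379, 0.7391304, 0.6571429)
)

# Numeric HR: a single slope is fitted, so the HR term uses 1 Df.
df_numeric <- anova(lm(R ~ HR, data = rsub))["HR", "Df"]

# Factor HR: one parameter per level beyond the first, so 3 levels use 2 Df.
rsub$HRf <- factor(rsub$HR)
df_factor <- anova(lm(R ~ HRf, data = rsub))["HRf", "Df"]

c(numeric = df_numeric, factor = df_factor)
```

Running str() on the real data frame would show directly whether HR is numeric or a factor.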

The script I'm using and its output are as follows:

> anova(rdayml <- lm(R ~ HR, data=rdata2, na.action=na.exclude)) 
Analysis of Variance Table

Response: R
          Df  Sum Sq Mean Sq F value  Pr(>F)   
HR         6 0.25992 0.04332  3.1762 0.00774 **
Residuals 77 1.05021 0.01364                   
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
> summary(rdayml <- lm(R ~ HR,data=rdata2))

lm(formula = R ~ HR, data = rdata2)

      Min        1Q    Median        3Q       Max 
-0.279725 -0.065416  0.005593  0.077486  0.201070 

             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.732082   0.033713  21.715   <2e-16 ***
HR3          0.005976   0.047678   0.125   0.9006    
HR4          0.067232   0.047678   1.410   0.1625    
HR5         -0.130935   0.047678  -2.746   0.0075 ** 
HR6         -0.013152   0.047678  -0.276   0.7834    
HR7         -0.034807   0.047678  -0.730   0.4676    
HR8          0.004971   0.047678   0.104   0.9172    
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.1168 on 77 degrees of freedom
Multiple R-squared: 0.1984,     Adjusted R-squared: 0.1359 
F-statistic: 3.176 on 6 and 77 DF,  p-value: 0.00774 

What I really don't understand is why the lm() summary lists the individual
hour numbers as coefficients (HR3, HR4, ...), as opposed to a single HR term.
On top of that, if R does display the data like this, then I don't understand
why it omits hour 2. If I can get this to work correctly, can I use the
p-values to determine which of the hours is significantly different from the
others, so that in this example hour 5 is significantly different? Or is it
just a case of using the p-value from the anova to determine that there is a
significant difference between hours (as in this case) and then using a plot
to determine which hour(s) are likely to be the cause?
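
One possible reading of the output, offered as a sketch and not a definitive answer: the 6 Df for HR and the coefficient names HR3 to HR8 suggest that HR is stored as a factor. Under R's default treatment contrasts, the first level (hour 2) becomes the reference and is absorbed into the intercept, and each remaining coefficient is a difference from that reference. The example below uses only hours 2, 4 and 5 copied from the table; the name rsub and the choice of hours are assumptions for illustration.

```r
# Values for hours 2, 4 and 5 copied from the data table in this post.
rsub <- data.frame(
  HR = factor(rep(c(2, 4, 5), each = 12)),
  R  = c(0.6666667, 0.4666667, 0.8000000, 0.6333333, 0.7333333, 0.8000000,
         0.8666667, 0.7857143, 0.7826087, 0.6666667, 0.9166667, 0.6666667,
         0.6571429, 0.7241379, 0.7391304, 0.6571429, 0.8000000, 0.9130435,
         0.7187500, 0.8437500, 0.9230769, 0.8571429, 0.8695652, 0.8888889,
         0.3333333, 0.5365854, 0.6774194, 0.7142857, 0.6904762, 0.5483871,
         0.5952381, 0.4166667, 0.5666667, 0.5952381, 0.7894737, 0.7500000)
)

fit <- lm(R ~ HR, data = rsub)

# Group means, for comparison with the fitted coefficients.
means <- tapply(rsub$R, rsub$HR, mean)

# The intercept is the mean of the reference level (hour 2); each HRk
# coefficient is mean(hour k) - mean(hour 2).
coef(fit)
means - means["2"]

# Each t-test in summary(fit) compares one hour with hour 2 only.
# One option for all pairwise comparisons with a multiplicity adjustment:
TukeyHSD(aov(R ~ HR, data = rsub))
```

Because the group means are the same as in the full data, the HR4 and HR5 estimates here should agree with those in the summary above, although the standard errors will differ since fewer groups contribute to the residual variance. On this reading, the p-value for HR5 says that hour 5 differs from hour 2, not from all other hours.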

Any help or advice would be most useful!

Best wishes,

