[R] lm coefficients output confusing
Ross Culloch
ross.culloch at dur.ac.uk
Thu Aug 13 22:45:31 CEST 2009
Hi all,
I have an issue with the lm() function regarding the listing of the
coefficients. My data are below, showing a list of hours (HR) relating to
the time spent resting (R) by an individual animal. I simply want to fit an
lm() and pass it to anova() to see if there is a significant difference in
resting between hours.
HR R
1 2 0.6666667
2 2 0.4666667
3 2 0.8000000
4 2 0.6333333
5 2 0.7333333
6 2 0.8000000
7 2 0.8666667
8 2 0.7857143
9 2 0.7826087
10 2 0.6666667
11 2 0.9166667
12 2 0.6666667
13 3 0.5294118
14 3 0.8541667
15 3 0.4583333
16 3 0.5882353
17 3 0.9347826
18 3 0.7878788
19 3 0.7857143
20 3 0.6944444
21 3 0.8333333
22 3 0.7450980
23 3 0.9230769
24 3 0.7222222
25 4 0.6571429
26 4 0.7241379
27 4 0.7391304
28 4 0.6571429
29 4 0.8000000
30 4 0.9130435
31 4 0.7187500
32 4 0.8437500
33 4 0.9230769
34 4 0.8571429
35 4 0.8695652
36 4 0.8888889
37 5 0.3333333
38 5 0.5365854
39 5 0.6774194
40 5 0.7142857
41 5 0.6904762
42 5 0.5483871
43 5 0.5952381
44 5 0.4166667
45 5 0.5666667
46 5 0.5952381
47 5 0.7894737
48 5 0.7500000
49 6 0.6268657
50 6 0.7187500
51 6 0.5500000
52 6 0.7164179
53 6 0.7656250
54 6 0.5869565
55 6 0.7164179
56 6 0.7031250
57 6 0.7230769
58 6 0.7462687
59 6 0.9200000
60 6 0.8536585
61 7 0.6379310
62 7 0.5357143
63 7 0.5227273
64 7 0.8000000
65 7 0.6724138
66 7 0.7083333
67 7 0.7241379
68 7 0.6938776
69 7 0.6545455
70 7 0.7931034
71 7 0.7560976
72 7 0.8684211
73 8 0.6727273
74 8 0.6000000
75 8 0.8333333
76 8 0.8181818
77 8 0.7818182
78 8 0.7647059
79 8 0.5818182
80 8 0.5918367
81 8 0.7450980
82 8 0.7818182
83 8 0.8048780
84 8 0.8684211
The script I'm using and its output are as follows:
> anova(rdayml <- lm(R ~ HR, data=rdata2, na.action=na.exclude))
Analysis of Variance Table

Response: R
          Df  Sum Sq Mean Sq F value  Pr(>F)
HR         6 0.25992 0.04332  3.1762 0.00774 **
Residuals 77 1.05021 0.01364
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> summary(rdayml <- lm(R ~ HR,data=rdata2))
Call:
lm(formula = R ~ HR, data = rdata2)

Residuals:
      Min        1Q    Median        3Q       Max
-0.279725 -0.065416  0.005593  0.077486  0.201070

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.732082   0.033713  21.715   <2e-16 ***
HR3          0.005976   0.047678   0.125   0.9006
HR4          0.067232   0.047678   1.410   0.1625
HR5         -0.130935   0.047678  -2.746   0.0075 **
HR6         -0.013152   0.047678  -0.276   0.7834
HR7         -0.034807   0.047678  -0.730   0.4676
HR8          0.004971   0.047678   0.104   0.9172
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1168 on 77 degrees of freedom
Multiple R-squared: 0.1984, Adjusted R-squared: 0.1359
F-statistic: 3.176 on 6 and 77 DF, p-value: 0.00774
What I really don't understand is why the lm summary lists the hour numbers
in the coefficients of the lm, as opposed to just reading HR. On top of that,
if R does display the data like this, then I don't understand why it omits
hour 2. If I can get this to work correctly, can I use the p-values to
determine which of the hours is significantly different from the others (so
in this example hour 5 is significantly different)? Or is it just a case of
using the p-value from the anova to determine that there is a significant
difference between hours (in this case) and using a plot to determine which
hour(s) are likely to be the cause?
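A minimal sketch of one way to look into this, under some assumptions: the 6 Df
for HR in the anova table suggests HR is stored as a factor with seven levels,
and R's default treatment contrasts are assumed to be in effect, so each HRk
row would be the difference between hour k and the first level (hour 2), with
hour 2 absorbed into the intercept. The names means, HR_num, fit_num and tukey
are only illustrative, and TukeyHSD() is a swapped-in suggestion for pairwise
comparisons, not something used in the script above.

```r
# R values copied from the data above: 12 observations for each of hours 2 to 8.
R <- c(
  0.6666667, 0.4666667, 0.8000000, 0.6333333, 0.7333333, 0.8000000,
  0.8666667, 0.7857143, 0.7826087, 0.6666667, 0.9166667, 0.6666667,  # hour 2
  0.5294118, 0.8541667, 0.4583333, 0.5882353, 0.9347826, 0.7878788,
  0.7857143, 0.6944444, 0.8333333, 0.7450980, 0.9230769, 0.7222222,  # hour 3
  0.6571429, 0.7241379, 0.7391304, 0.6571429, 0.8000000, 0.9130435,
  0.7187500, 0.8437500, 0.9230769, 0.8571429, 0.8695652, 0.8888889,  # hour 4
  0.3333333, 0.5365854, 0.6774194, 0.7142857, 0.6904762, 0.5483871,
  0.5952381, 0.4166667, 0.5666667, 0.5952381, 0.7894737, 0.7500000,  # hour 5
  0.6268657, 0.7187500, 0.5500000, 0.7164179, 0.7656250, 0.5869565,
  0.7164179, 0.7031250, 0.7230769, 0.7462687, 0.9200000, 0.8536585,  # hour 6
  0.6379310, 0.5357143, 0.5227273, 0.8000000, 0.6724138, 0.7083333,
  0.7241379, 0.6938776, 0.6545455, 0.7931034, 0.7560976, 0.8684211,  # hour 7
  0.6727273, 0.6000000, 0.8333333, 0.8181818, 0.7818182, 0.7647059,
  0.5818182, 0.5918367, 0.7450980, 0.7818182, 0.8048780, 0.8684211   # hour 8
)

# Assumption: HR is a factor, as the 6 Df in the anova table suggests.
rdata2 <- data.frame(HR = factor(rep(2:8, each = 12)), R = R)

fit <- lm(R ~ HR, data = rdata2)
means <- tapply(rdata2$R, rdata2$HR, mean)  # mean of R within each hour

# With treatment contrasts, the intercept is the mean of the reference
# level (hour 2) ...
print(all.equal(unname(coef(fit)["(Intercept)"]), unname(means["2"])))
# ... and HR5 is mean(hour 5) minus mean(hour 2).
print(all.equal(unname(coef(fit)["HR5"]), unname(means["5"] - means["2"])))

# If HR were numeric instead, lm() would fit a single slope term.
rdata2$HR_num <- as.numeric(as.character(rdata2$HR))
fit_num <- lm(R ~ HR_num, data = rdata2)
print(names(coef(fit_num)))  # "(Intercept)" "HR_num"

# One common option for comparing every pair of hours (not just each
# hour against hour 2) is Tukey's honest significant differences.
tukey <- TukeyHSD(aov(R ~ HR, data = rdata2))
print(tukey)
```

If that reading is right, the p-value on the HR5 row would only compare hour 5
with hour 2, not hour 5 with all the other hours.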
Any help or advice would be most useful!
Best wishes,
Ross
--
View this message in context: http://www.nabble.com/lm-coefficients-output-confusing-tp24958398p24958398.html
Sent from the R help mailing list archive at Nabble.com.