[R] meaning of tests presented in anova(ols(...)) {Design package}
Dylan Beaudette
dylan.beaudette at gmail.com
Tue Jul 15 06:34:33 CEST 2008
Hi,
I am curious about how to interpret the table produced by
anova(ols(...)) from the Design package. I have a multiple linear
regression model, with some interaction terms, defined by:
ols(formula = log(ksat * 60 * 60) ~ log(sar) * pol(activity, 3) +
      log(conc) * pol(sand, 3),
    data = sm.clean, x = TRUE, y = TRUE)
      n  Model L.R.  d.f.    R2  Sigma
   1834        1203    14  0.48    1.2

Residuals:
    Min      1Q  Median      3Q     Max
 -5.033  -0.859   0.016   0.739   4.868
Coefficients:
                       Value  Std. Error      t         Pr(>|t|)
Intercept         11.3886790   2.0220171   5.63  0.0000000205580
sar               -4.3991263   1.0157588  -4.33  0.0000156609226
activity         -40.0591221   5.6907822  -7.04  0.0000000000027
activity^2        33.0570116   5.0578520   6.54  0.0000000000819
activity^3        -8.1645147   1.3750370  -5.94  0.0000000034548
conc               0.3841260   0.0813200   4.72  0.0000024942478
sand              -0.0096212   0.0327415  -0.29  0.7689032898947
sand^2             0.0008495   0.0008589   0.99  0.3227487169683
sand^3             0.0000025   0.0000066   0.39  0.6994987342042
sar * activity    12.8134698   2.9513942   4.34  0.0000149300007
sar * activity^2  -9.9981381   2.6310765  -3.80  0.0001494462966
sar * activity^3   2.1481278   0.7168339   3.00  0.0027662261037
conc * sand       -0.0157426   0.0076013  -2.07  0.0384966958735
conc * sand^2      0.0003419   0.0001989   1.72  0.0857381555491
conc * sand^3     -0.0000027   0.0000015  -1.77  0.0777025949762
Looking at what I think are the "marginal" p-values, i.e. the results
of a test of H0: coef_i = 0 for each coefficient, there are several
terms whose coefficients are non-significant at p < 0.05. Does a
non-significant coefficient warrant removal of that term from the
model, or perhaps just a mention in the discussion?
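
To make the question concrete, this is the kind of check I had in
mind: refit the model without one of the questionable pieces (here the
conc * sand interaction) and compare the nested fits. The fit names are
mine, and I am assuming lrtest() from Design is an appropriate way to
compare two nested ols() fits:

library(Design)

## full model, as printed above
f.full <- ols(log(ksat * 60 * 60) ~ log(sar) * pol(activity, 3) +
                log(conc) * pol(sand, 3),
              data = sm.clean, x = TRUE, y = TRUE)

## reduced model: keep the conc and sand main effects,
## drop their interaction
f.red <- ols(log(ksat * 60 * 60) ~ log(sar) * pol(activity, 3) +
               log(conc) + pol(sand, 3),
             data = sm.clean, x = TRUE, y = TRUE)

## likelihood-ratio test of the terms dropped from the full model
lrtest(f.red, f.full)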
Compared with the per-coefficient tests above, what tests are performed
when calling anova() on this object? Here is the output in R:
                Analysis of Variance          Response: log(ksat * 60 * 60)

 Factor                                         d.f. Partial SS     MS      F      P
 sar  (Factor+Higher Order Factors)                4     168.43  42.11   27.0 <.0001
  All Interactions                                 3     142.13  47.38   30.4 <.0001
 activity  (Factor+Higher Order Factors)           6     536.84  89.47   57.3 <.0001
  All Interactions                                 3     142.13  47.38   30.4 <.0001
  Nonlinear (Factor+Higher Order Factors)          4     257.25  64.31   41.2 <.0001
 conc  (Factor+Higher Order Factors)               4     443.02 110.75   71.0 <.0001
  All Interactions                                 3      76.74  25.58   16.4 <.0001
 sand  (Factor+Higher Order Factors)               6    1906.29 317.71  203.6 <.0001
  All Interactions                                 3      76.74  25.58   16.4 <.0001
  Nonlinear (Factor+Higher Order Factors)          4     263.00  65.75   42.1 <.0001
 sar * activity  (Factor+Higher Order Factors)     3     142.13  47.38   30.4 <.0001
  Nonlinear                                        2      95.32  47.66   30.5 <.0001
  Nonlinear Interaction : f(A,B) vs. AB            2      95.32  47.66   30.5 <.0001
 conc * sand  (Factor+Higher Order Factors)        3      76.74  25.58   16.4 <.0001
  Nonlinear                                        2       4.98   2.49    1.6 0.203
  Nonlinear Interaction : f(A,B) vs. AB            2       4.98   2.49    1.6 0.203
 TOTAL NONLINEAR                                   8     455.20  56.90   36.5 <.0001
 TOTAL INTERACTION                                 6     218.87  36.48   23.4 <.0001
 TOTAL NONLINEAR + INTERACTION                    10     573.36  57.34   36.7 <.0001
 REGRESSION                                       14    2631.53 187.97  120.4 <.0001
 ERROR                                          1819    2839.25   1.56
Are more of the terms significant (at p < 0.05) here because related
model terms are pooled into single tests? I have looked through Frank's
book on the topic, but I can't quite wrap my head around what the above
is telling me. I am mostly interested in presenting a model for use as
an applied tool, so interpretation of the terms and interactions is
very important.
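
To be explicit about the pooling question: my reading of the
anova.Design help page (which I may be misunderstanding) is that each
line pools every parameter associated with a predictor, and that a
subset of predictors can be tested by naming them, e.g.:

## pooled (chunk) F test of everything involving sand: its linear,
## nonlinear, and interaction parameters together
## (f.full is the ols() fit above; naming predictors like this is my
## reading of anova.Design, so it may be off)
anova(f.full, sand)

## combined test of all parameters associated with either conc or sand
anova(f.full, conc, sand)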
Thanks,
Dylan