Frank E Harrell Jr
f.harrell at vanderbilt.edu
Wed Jul 16 03:25:22 CEST 2008
Dylan Beaudette wrote:
> Hi,
>
> I am curious about how to interpret the table produced by
> anova(ols(...)), from the Design package. I have a multiple linear
> regression model, with some interaction, defined by:
>
> ols(formula = log(ksat * 60 * 60) ~ log(sar) * pol(activity,
> 3) + log(conc) * pol(sand, 3), data = sm.clean, x = TRUE,
> y = TRUE)
>
>     n Model L.R. d.f.   R2 Sigma
>  1834       1203   14 0.48   1.2
>
> Residuals:
>    Min     1Q Median     3Q    Max
> -5.033 -0.859  0.016  0.739  4.868
>
> Coefficients:
>                        Value Std. Error     t        Pr(>|t|)
> Intercept         11.3886790  2.0220171  5.63 0.0000000205580
> sar               -4.3991263  1.0157588 -4.33 0.0000156609226
> activity         -40.0591221  5.6907822 -7.04 0.0000000000027
> activity^2        33.0570116  5.0578520  6.54 0.0000000000819
> activity^3        -8.1645147  1.3750370 -5.94 0.0000000034548
> conc               0.3841260  0.0813200  4.72 0.0000024942478
> sand              -0.0096212  0.0327415 -0.29 0.7689032898947
> sand^2             0.0008495  0.0008589  0.99 0.3227487169683
> sand^3             0.0000025  0.0000066  0.39 0.6994987342042
> sar * activity    12.8134698  2.9513942  4.34 0.0000149300007
> sar * activity^2  -9.9981381  2.6310765 -3.80 0.0001494462966
> sar * activity^3   2.1481278  0.7168339  3.00 0.0027662261037
> conc * sand       -0.0157426  0.0076013 -2.07 0.0384966958735
> conc * sand^2      0.0003419  0.0001989  1.72 0.0857381555491
> conc * sand^3     -0.0000027  0.0000015 -1.77 0.0777025949762
>
>
> Looking at what I think are "marginal p-values", i.e. the results of
> testing each coefficient against zero (H0: coef_i = 0), there are
> several terms with non-significant coefficients (at p < 0.05). Does a
> non-significant coefficient warrant removal from the model, or perhaps
> a mention in the discussion?
No
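
For reference, each P-value in the coefficient table is a two-sided t-test of one coefficient, with all the other terms left in the model. A minimal sketch in base R that recomputes the sand row from its printed estimate and standard error. The residual d.f. is assumed here to be n minus the 15 estimated coefficients (1834 - 15 = 1819), and the object names are only illustrative:

```r
# Values copied from the 'sand' row of the coefficient table
est <- -0.0096212
se  <- 0.0327415
df_resid <- 1834 - 15   # assumed: n minus number of estimated coefficients

# t statistic and two-sided P-value for H0: coefficient = 0
t_stat <- est / se
p_val  <- 2 * pt(abs(t_stat), df = df_resid, lower.tail = FALSE)

print(round(t_stat, 2))
print(round(p_val, 4))
```

Up to rounding, this should agree with the printed t and Pr(>|t|) for that row. A test like this says nothing about whether sand as a whole matters, because sand also enters through sand^2, sand^3 and the interactions with conc.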
>
> Compared to the above example, what tests are performed when calling
> anova() on this object? Here is the output in R:
Mark Difford gave a nice response for that.
Frank
>
> Analysis of Variance Response: log(ksat * 60 * 60)
>
> Factor                                       d.f. Partial SS     MS     F P
> sar (Factor+Higher Order Factors)               4     168.43  42.11  27.0 <.0001
>  All Interactions                               3     142.13  47.38  30.4 <.0001
> activity (Factor+Higher Order Factors)          6     536.84  89.47  57.3 <.0001
>  All Interactions                               3     142.13  47.38  30.4 <.0001
>  Nonlinear (Factor+Higher Order Factors)        4     257.25  64.31  41.2 <.0001
> conc (Factor+Higher Order Factors)              4     443.02 110.75  71.0 <.0001
>  All Interactions                               3      76.74  25.58  16.4 <.0001
> sand (Factor+Higher Order Factors)              6    1906.29 317.71 203.6 <.0001
>  All Interactions                               3      76.74  25.58  16.4 <.0001
>  Nonlinear (Factor+Higher Order Factors)        4     263.00  65.75  42.1 <.0001
> sar * activity (Factor+Higher Order Factors)    3     142.13  47.38  30.4 <.0001
>  Nonlinear                                      2      95.32  47.66  30.5 <.0001
>  Nonlinear Interaction : f(A,B) vs. AB          2      95.32  47.66  30.5 <.0001
> conc * sand (Factor+Higher Order Factors)       3      76.74  25.58  16.4 <.0001
>  Nonlinear                                      2       4.98   2.49   1.6 0.203
>  Nonlinear Interaction : f(A,B) vs. AB          2       4.98   2.49   1.6 0.203
> TOTAL NONLINEAR                                 8     455.20  56.90  36.5 <.0001
> TOTAL INTERACTION                               6     218.87  36.48  23.4 <.0001
> TOTAL NONLINEAR + INTERACTION                  10     573.36  57.34  36.7 <.0001
> REGRESSION                                     14    2631.53 187.97 120.4 <.0001
> ERROR                                        1819    2839.25   1.56
>
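
The arithmetic behind any row of the table above can be checked by hand: MS is Partial SS divided by its d.f., F is that MS divided by the ERROR mean square, and P is the upper tail of the F distribution. A small sketch in base R for the conc * sand "Nonlinear" row, using only numbers printed in the table (the object names are illustrative):

```r
# Values copied from the ANOVA table
ss_term  <- 4.98      # Partial SS, conc * sand "Nonlinear" row
df_term  <- 2
ss_error <- 2839.25   # Partial SS, ERROR row
df_error <- 1819

ms_term  <- ss_term / df_term      # mean square for the term
ms_error <- ss_error / df_error    # residual mean square
f_stat   <- ms_term / ms_error     # F statistic on (2, 1819) d.f.
p_val    <- pf(f_stat, df_term, df_error, lower.tail = FALSE)

print(round(c(MS = ms_term, MSE = ms_error, F = f_stat, P = p_val), 3))
```

Up to rounding, the result should match the printed MS, F and P for that row.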
> Are more of the 'terms' significant (at p<0.05) due to pooling of
> model terms? I have looked through Frank's book on the topic, but
> can't quite wrap my head around what the above is telling me. I am
> mostly interested in presenting a model for use as an applied tool, and
> interpretation of terms / interaction is very important.
>
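
On the pooling question: a "Factor+Higher Order Factors" line is a joint, multiple d.f. test that every coefficient involving that predictor is zero (its main effect plus all of its interaction terms). Such a test can be significant even when several of the individual coefficients are not. The sketch below illustrates the idea with base R's lm() and anova() on simulated data. The variables x1, x2 and y are hypothetical stand-ins, not the columns of sm.clean, and this is not the Design package's own code:

```r
set.seed(42)
n <- 500

# Simulated data (hypothetical): x1 interacts with x2
d <- data.frame(x1 = rnorm(n), x2 = runif(n))
d$y <- 1 + 0.5 * d$x1 + 2 * d$x2 + 0.8 * d$x1 * d$x2 + rnorm(n)

# Full model: x1 interacting with a cubic polynomial in x2
full <- lm(y ~ x1 * poly(x2, 3, raw = TRUE), data = d)

# Remove every term that involves x1: 1 main effect + 3 interactions
no_x1 <- lm(y ~ poly(x2, 3, raw = TRUE), data = d)

# Joint 4 d.f. F-test, analogous to an "x1 (Factor+Higher Order Factors)" line
joint <- anova(no_x1, full)
print(joint)

# Single-coefficient t-tests from the same fit, for comparison
print(round(coef(summary(full)), 4))
```

The joint test asks whether x1 contributes anything at all, which is usually the more useful question when a predictor appears in several terms.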
> Thanks,
>
> Dylan
>
