[R] testing two-factor anova effects using model comparison approach with lm() and anova()
Greg Snow
Greg.Snow at imail.org
Fri Feb 27 19:30:46 CET 2009
Notice also the degrees of freedom in the different models.
With factors A and B, the 2 models:
A + B + A:B
and
A + A:B
are actually the same overall model, just with different parameterizations (you can also see this by using x=TRUE in the call to lm and looking at the x matrix used). The same holds for B + A:B, the restricted model you fit.
Testing whether the main effect A should be in the model, given that the interaction is in the model, does not make sense in most cases. So when you drop the main effect but keep the interaction, the formula notation gives a different parameterization of the same model rather than that generally uninteresting test.
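
A minimal sketch of the point above, using simulated data rather than the data from the original post. The data frame d, the response y, and the object names full and alt are made up for illustration; the design is a balanced 3x2 layout with 5 observations per cell, which gives the same 24 residual degrees of freedom as in the question.

```r
# Hypothetical balanced 3x2 design with simulated data (not the poster's data).
set.seed(1)
d <- expand.grid(rep = 1:5, A = factor(1:3), B = factor(1:2))
d$y <- rnorm(nrow(d))

# Two formulas, both keeping the interaction; x = TRUE stores the design matrix.
full <- lm(y ~ A + B + A:B, data = d, x = TRUE)
alt  <- lm(y ~ A + A:B,     data = d, x = TRUE)

# Identical fitted values and residual df: one model, two parameterizations.
all.equal(fitted(full), fitted(alt))
c(df.residual(full), df.residual(alt))

# The design matrices have different columns but the same rank
# (6, the number of cells in the 3x2 design).
colnames(full$x)
colnames(alt$x)
c(full$rank, alt$rank)

# In the second fit the A:B term carries 3 df instead of 2:
# it has absorbed the df that belonged to the main effect of B.
anova(alt)
```

Comparing the two fits with anova(full, alt) should therefore show no change in residual sum of squares and 0 df for the difference, which is the pattern reported in the question.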
--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Paul Gribble
> Sent: Friday, February 27, 2009 11:01 AM
> To: r-help at r-project.org
> Subject: [R] testing two-factor anova effects using model comparison approach with lm() and anova()
>
> I wonder if someone could explain the behavior of the anova() and lm()
> functions in the following situation:
>
> I have a standard 3x2 factorial design: factorA has 3 levels, factorB has 2
> levels, and they are fully crossed. I have a dependent variable DV.
>
> Of course I can do the following to get the usual anova table:
>
> > anova(lm(DV~factorA+factorB+factorA:factorB))
> Analysis of Variance Table
>
> Response: DV
>                 Df  Sum Sq Mean Sq F value   Pr(>F)
> factorA          2  7.4667  3.7333  4.9778 0.015546 *
> factorB          1  2.1333  2.1333  2.8444 0.104648
> factorA:factorB  2  9.8667  4.9333  6.5778 0.005275 **
> Residuals       24 18.0000  0.7500
>
> This is perfectly satisfactory for my situation, but as a pedagogical
> exercise, I wanted to demonstrate the model comparison approach to analysis
> of variance by using anova() to compare a full model that contains all
> effects, to restricted models that contain all effects save for the effect
> of interest.
>
> The test of the interaction effect seems to be as I expected:
>
> > fullmodel<-lm(DV~factorA+factorB+factorA:factorB)
> > restmodel<-lm(DV~factorA+factorB)
> > anova(fullmodel,restmodel)
> Analysis of Variance Table
>
> Model 1: DV ~ factorA + factorB + factorA:factorB
> Model 2: DV ~ factorA + factorB
>   Res.Df     RSS Df Sum of Sq      F   Pr(>F)
> 1     24 18.0000
> 2     26 27.8667 -2   -9.8667 6.5778 0.005275 **
>
> As you can see the value of F (6.5778) is the same as in the anova table
> above. All is well.
>
> However, if I try to test a main effect, e.g. factorA, by testing the full
> model against a restricted model that doesn't contain the main effect
> factorA, I get something strange:
>
> > restmodel<-lm(DV~factorB+factorA:factorB)
> > anova(fullmodel,restmodel)
> Analysis of Variance Table
>
> Model 1: DV ~ factorA + factorB + factorA:factorB
> Model 2: DV ~ factorB + factorA:factorB
>   Res.Df RSS Df Sum of Sq F Pr(>F)
> 1     24  18
> 2     24  18  0         0
>
> Upon inspection of each model I see that the residuals are identical, which
> is not what I was expecting:
>
> > anova(fullmodel)
> Analysis of Variance Table
>
> Response: DV
>                 Df  Sum Sq Mean Sq F value   Pr(>F)
> factorA          2  7.4667  3.7333  4.9778 0.015546 *
> factorB          1  2.1333  2.1333  2.8444 0.104648
> factorA:factorB  2  9.8667  4.9333  6.5778 0.005275 **
> Residuals       24 18.0000  0.7500
>
> This looks fine, but then the restricted model is where things are not as I
> expected:
>
> > anova(restmodel)
> Analysis of Variance Table
>
> Response: DV
>                 Df  Sum Sq Mean Sq F value   Pr(>F)
> factorB          1  2.1333  2.1333  2.8444 0.104648
> factorB:factorA  4 17.3333  4.3333  5.7778 0.002104 **
> Residuals       24 18.0000  0.7500
>
> I was expecting the Residuals in the restricted model (the one not
> containing main effect of factorA) to be larger than in the full model
> containing all three effects. In other words, the variance accounted for by
> the main effect factorA should be added to the Residuals. Instead, it looks
> like the variance accounted for by the main effect of factorA is being
> soaked up by the factorA:factorB interaction term. Strangely, the degrees of
> freedom are also affected.
>
> I must be misunderstanding something here. Can someone point out what is
> happening?
>
> Thanks,
>
> -Paul
>
> --
> Paul L. Gribble, Ph.D.
> Associate Professor
> Dept. Psychology
> The University of Western Ontario
> London, Ontario
> Canada N6A 5C2
> Tel. +1 519 661 2111 x82237
> Fax. +1 519 661 3961
> pgribble at uwo.ca
> http://gribblelab.org