[R] Why can repeated measures anova with within & between subjects design not be done if group sizes are unbalanced?
Yuelin Li
liy12 at mskcc.org
Thu Nov 8 16:52:58 CET 2007
Hope I am not too late joining this thread. I believe the difference
between R and SPSS is because SPSS adjusts the Type III SS by the
harmonic mean of the unbalanced cell sizes. This calculation is
discussed in Maxwell and Delaney (1990, pp. 271-297).
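To make the harmonic mean concrete, here is a small sketch. It assumes two between-subject groups of 8 and 4 subjects (the sizes used in the details further down). The variable names are made up for illustration, and this is not SPSS's internal code. It only shows that the harmonic mean of unequal cell sizes falls below their arithmetic mean.

```r
# Hypothetical cell sizes for two between-subject groups (assumed: 8 and 4)
cell_n <- c(grp1 = 8, grp2 = 4)

# Harmonic mean: number of cells divided by the sum of reciprocal cell sizes
harmonic_n <- length(cell_n) / sum(1 / cell_n)

# Arithmetic mean, for comparison
arithmetic_n <- mean(cell_n)

print(c(harmonic = harmonic_n, arithmetic = arithmetic_n))
```

With equal cell sizes the two means coincide, which is consistent with the discrepancy showing up only in unbalanced designs.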
In short, the best explanation I can offer (see details below) is that
SPSS and R produce the same output if you tell SPSS to use SSTYPE(1)
or SSTYPE(2) instead of the default SSTYPE(3). As discussed in
Maxwell and Delaney, the calculations of Type I and Type II SS do not
involve the harmonic mean. Maxwell and Delaney discuss the pros and
cons of each type of sums of squares. Apparently SPSS treats the
harmonic-mean Type III SS as the *right* analysis. Like the people who
responded before me, I'd also suggest using lme() for unbalanced designs.
Yuelin.
---- details -------
I used the Hays.df data:
http://www.psych.upenn.edu/~baron/rpsych/rpsych.html
And I added one between-subject variable:
Hays.df$grpuneven <- c(1,1,1,1,1,1,1,1,2,2,2,2) # n=8 in grp 1; 4 in grp 2
I ran
aov(rt ~ grpuneven*color*shape + Error(subj/(shape*color)), data=Hays.df)
which gives the same output as SPSS with SSTYPE(1) or SSTYPE(2), using
this syntax:
GLM
Sh1Col1 Sh2Col1 Sh1Col2 Sh2Col2 BY grpuneven
/WSFACTOR = color 2 Polynomial shape 2 Polynomial
/METHOD = SSTYPE(2)
/CRITERIA = ALPHA(.05)
/WSDESIGN = color shape color*shape
/DESIGN = grpuneven .
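As a rough illustration of the lme() suggestion above, here is a minimal sketch on simulated data, not on Hays.df itself. The data frame d, its column names, and the random-intercept-only structure (random = ~ 1 | subj) are all assumptions chosen to keep the example short. A real analysis may need a richer random-effects specification.

```r
library(nlme)  # recommended package distributed with R

set.seed(1)

# Simulated stand-in for Hays.df: 12 subjects x 2 colors x 2 shapes
d <- expand.grid(subj  = factor(1:12),
                 color = factor(c("c1", "c2")),
                 shape = factor(c("s1", "s2")))

# Unbalanced between-subject factor: subjects 1-8 in g1, 9-12 in g2
d$grp <- factor(ifelse(as.integer(d$subj) <= 8, "g1", "g2"))

# Made-up response: a subject-specific intercept plus residual noise
subj_eff <- rnorm(12, sd = 5)
d$rt <- 50 + subj_eff[as.integer(d$subj)] + rnorm(nrow(d), sd = 3)

# Mixed model with a random intercept per subject (an assumed structure)
fit <- lme(rt ~ grp * color * shape, random = ~ 1 | subj, data = d)

# F tests for the fixed effects; sequential by default
anova(fit)
```

anova() on an lme fit also accepts type = "marginal" if marginal rather than sequential tests are wanted.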
-- Gilbert G wrote --|Sun (Nov/04/2007)[04:34]|--:
Dear R people:
I wish to switch from SPSS to R, but there is one particular type of
ANOVA design that cannot be done in R. Or more likely, it can be
done, but it is nowhere documented.
[... snip ...]
Now, in R you would have something like this, as anybody who does
balanced repeated measures ANOVAs might know:
aov( RT ~ color * shape * MyGroup + Error( Subjects/(color*shape) ) )
In SPSS you would have something like this (of course with the data
organized slightly differently):
GLM
x1 x2 x3 x4 BY MyGroup
/WSFACTOR = color 2 Polynomial shape 2 Polynomial
/METHOD = SSTYPE(3)
/CRITERIA = ALPHA(.05)
/WSDESIGN = color shape color*shape
/DESIGN = MyGroup .
OK, the question is: if the group sizes are different (say 10 people
in one group and 12 people in the other group), R is going to give the
wrong answer. Of course that is not R's fault.
BUT MY QUESTION IS: HOW TO GET THE UNBALANCED REPEATED MEASURES ANOVA RIGHT?