[R] A concrete type I/III Sum of square problem
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Thu Feb 16 11:55:47 CET 2006
Gregor Gorjanc <gregor.gorjanc at gmail.com> writes:
> > WPhantom <wp1 at tiscali.fr> writes:
> >
> >>> Thanks, Brian, for the reference.
> >>> I just discovered that it is available in our
> >>> library, so I am going to borrow it and read it soon.
> >>> Actually, I don't even know the difference
> >>> between a multistratum and a single-stratum AOV. A
> >>> quick search on Google returned R materials, so I imagine
> >>> that these concepts are quite specific to R.
> >
> > You have to be careful not to confuse Google's view of the world with
> > Reality...
> >
> > The concept of error strata is much older than R, and existed for
> > instance in Genstat, anno 1977 or so. However, Genstat seems to have
> > left little impression on the Internet.
> >
> >>> I will read the book first before asking for more information.
> >
> > The executive summary is that the concept of error strata relies
> > substantially on having a balanced design (at least for the random
> > effects), so that the analysis can be decomposed into analyses of
> > means, contrasts, and contrasts of means. For unbalanced designs, you
> > usually get meaningless analyses.
> >
>
> Can you (prof. Dalgaard) please point us to a relevant book on these
> topics? I am very interested in it since my data are often unbalanced.
Hmm, the Danish tradition is highly based on lecture notes, so I don't
have a specific book for you. One possible starting point is
Tue Tjur (1984): Analysis of variance models in orthogonal designs.
Int. Statist. Review 52, 33-81.
The thing to notice in relation to that paper is that the
decomposition (p.55) of the covariance matrix as sum(lambda_B Q_B^0)
is highly dependent on having an orthogonal design. Without the
orthogonality, it still defines a model, but typically one without a
sensible interpretation.
Look at a simple 1-way anova with three groups of equal size. The Q
matrices will be the projections P_X and I-P_X, where X is the design
matrix for the grouping factor, e.g.
> X <- model.matrix(~factor(rep(1:3,each=2)))
> X
  (Intercept) factor(rep(1:3, each = 2))2 factor(rep(1:3, each = 2))3
1           1                           0                           0
2           1                           0                           0
3           1                           1                           0
4           1                           1                           0
5           1                           0                           1
6           1                           0                           1
...
P_X can be found in the following semi-secret way:
> P <- stats:::proj.matrix(X)
> P
    1   2   3   4   5   6
1 0.5 0.5 0.0 0.0 0.0 0.0
2 0.5 0.5 0.0 0.0 0.0 0.0
3 0.0 0.0 0.5 0.5 0.0 0.0
4 0.0 0.0 0.5 0.5 0.0 0.0
5 0.0 0.0 0.0 0.0 0.5 0.5
6 0.0 0.0 0.0 0.0 0.5 0.5
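
For readers who would rather not rely on an unexported function, here is a minimal sketch of the same projection built from documented base R calls only. The names g, P_formula and P_qr are illustrative and do not appear in the original post.

```r
# Balanced one-way layout: three groups of two observations each
g <- factor(rep(1:3, each = 2))
X <- model.matrix(~ g)

# Textbook formula for the orthogonal projection onto the column space of X
P_formula <- X %*% solve(crossprod(X)) %*% t(X)

# Numerically more stable route via the QR decomposition: P = Q Q'
Q <- qr.Q(qr(X))
P_qr <- tcrossprod(Q)

# Both routes should agree, and a projection is idempotent
stopifnot(isTRUE(all.equal(P_formula, P_qr, check.attributes = FALSE)))
stopifnot(isTRUE(all.equal(P_qr %*% P_qr, P_qr)))

round(P_qr, 2)
```

Either route should give the block matrix of 0.5 entries printed above; the QR version avoids explicitly inverting X'X.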
Suppose we put a random component of 10 on P_X and 1 on (I-P_X).
We then get
> diag(6) - P + 10*P
    1   2   3   4   5   6
1 5.5 4.5 0.0 0.0 0.0 0.0
2 4.5 5.5 0.0 0.0 0.0 0.0
3 0.0 0.0 5.5 4.5 0.0 0.0
4 0.0 0.0 4.5 5.5 0.0 0.0
5 0.0 0.0 0.0 0.0 5.5 4.5
6 0.0 0.0 0.0 0.0 4.5 5.5
which is a perfectly sensible covariance for within-group correlated
data.
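
One way to see why this covariance is sensible is to check that it coincides with an ordinary random-intercept model. The sketch below assumes a residual variance of 1 and a between-group variance of (10 - 1)/2 = 4.5, which is what the weights 10 and 1 imply for groups of size 2; the names Z, V_strata and V_random are illustrative only.

```r
g <- factor(rep(1:3, each = 2))
Z <- model.matrix(~ g - 1)                  # group indicator matrix
P <- Z %*% solve(crossprod(Z)) %*% t(Z)     # projection onto the group means

# Stratum form: weight 1 on (I - P) and weight 10 on P
V_strata <- 1 * (diag(6) - P) + 10 * P

# Random-intercept form: sigma^2 * I + sigma_b^2 * Z Z'
sigma2_within  <- 1
sigma2_between <- (10 - 1) / 2              # (between weight - within weight) / group size
V_random <- sigma2_within * diag(6) + sigma2_between * tcrossprod(Z)

stopifnot(isTRUE(all.equal(V_strata, V_random, check.attributes = FALSE)))

# Within-group correlation implied by the model: 4.5 / 5.5
unname(cov2cor(V_random)[1, 2])
```

With equal group sizes the two parametrisations are interchangeable, which is why the stratum weights have a clean interpretation here.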
Now try the same stunt with unbalanced data:
> X <- model.matrix(~factor(rep(1:3,1:3))-1)
> P <- stats:::proj.matrix(X)
> diag(6) - P + 10*P
   1   2   3 4 5 6
1 10 0.0 0.0 0 0 0
2  0 5.5 4.5 0 0 0
3  0 4.5 5.5 0 0 0
4  0 0.0 0.0 4 3 3
5  0 0.0 0.0 3 4 3
6  0 0.0 0.0 3 3 4
I.e. we are de facto assuming that observations in the smaller groups
have a larger variance than observations in the larger groups (here
10, 5.5 and 4 for group sizes 1, 2 and 3).
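
The dependence on group size can be made explicit with a short sketch. It checks that the implied variance of an observation is 1 + 9/n for a group of size n, and contrasts this with a random-intercept covariance whose variance is the same for every observation. Reusing the components 1 and 4.5 from the balanced case is an assumption made purely for illustration, and the names n_per_obs, V_strata and V_random are not from the original post.

```r
g <- factor(rep(1:3, times = 1:3))          # group sizes 1, 2 and 3
Z <- model.matrix(~ g - 1)
P <- Z %*% solve(crossprod(Z)) %*% t(Z)
V_strata <- diag(6) - P + 10 * P

# Size of the group that each observation belongs to
n_per_obs <- as.vector(table(g))[as.integer(g)]

# Implied variance under the stratum weights: 1 + 9 / (group size)
stopifnot(isTRUE(all.equal(unname(diag(V_strata)), 1 + 9 / n_per_obs)))

# Random-intercept covariance (assumed components 1 and 4.5):
# the variance no longer depends on the group size
V_random <- diag(6) + 4.5 * tcrossprod(Z)
unname(diag(V_random))
```

This is the practical reason the multistratum decomposition is hard to interpret for unbalanced data, whereas a model that specifies the variance components directly does not tie the variance to the group size.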
> >>> Thanks
> >>>
> >>> Sylvain Clément
> >>>
> >>> At 12:38 14/02/2006, you wrote:
> >>
> >>>> >More to the point, you are confusing
> >>>> >multistratum AOV with single-stratum AOV. For
> >>>> >a good tutorial, see MASS4 (bibliographic
> >>>> >information in the R FAQ). For unbalanced data
> >>>> >we suggest you use lme() instead.
>
> I do not have the whole book in my head as prof. Ripley probably does,
> but I cannot recall reading about this in MASS4. I am sure I am wrong,
> so would you (prof. Ripley) please be so kind as to point us to the relevant
> chapters/pages?
>
> Many thanks.
>
> --
> Lep pozdrav / With regards,
> Gregor Gorjanc
>
> ----------------------------------------------------------------------
> University of Ljubljana PhD student
> Biotechnical Faculty
> Zootechnical Department URI: http://www.bfro.uni-lj.si/MR/ggorjan
> Groblje 3 mail: gregor.gorjanc <at> bfro.uni-lj.si
>
> SI-1230 Domzale tel: +386 (0)1 72 17 861
> Slovenia, Europe fax: +386 (0)1 72 17 888
>
> ----------------------------------------------------------------------
> "One must learn by doing the thing; for though you think you know it,
> you have no certainty until you try." Sophocles ~ 450 B.C.
> ----------------------------------------------------------------------
>
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907