[R] Wrapper of linearHypothesis (car) for post-hoc of repeated measures ANOVA
John Fox
jfox at mcmaster.ca
Fri Sep 23 16:27:29 CEST 2011
Dear Helios,
I've now had a chance to look at your code for the factorltest.mlm() function. I agree that the function makes it easier to test hypotheses in repeated-measures ANOVAs. When I have some more time, I'll make a few suggestions (off list) for improving the user interface to the function.
Best,
John
--------------------------------
John Fox
Senator William McMaster
Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
> Behalf Of Helios de Rosario
> Sent: September-22-11 12:41 PM
> To: r-help at r-project.org
> Subject: [R] Wrapper of linearHypothesis (car) for post-hoc of repeated
> measures ANOVA
>
> For some time I have been looking for a convenient way of performing post-hoc
> analysis to Repeated Measures ANOVA, that would be acceptable if sphericity
> is violated (i.e. leaving aside post-hoc to lme models).
>
> The best solution I found was John Fox's proposal to similar requests in R-
> help:
> http://tolstoy.newcastle.edu.au/R/e2/help/07/09/26518.html
> http://tolstoy.newcastle.edu.au/R/e10/help/10/04/1663.html
>
> However, I think that using linearHypothesis() is not as straightforward as I
> would desire for testing specific contrasts between factor levels. The
> hypotheses must be defined as linear combinations of the model coefficients
> (subject to response transformations according to the intra-subjects design),
> which may need some involved calculations if one is thinking on differences
> between "this and that" factor levels (either between-subjects or intra-
> subjects), and the issue gets worse for post-hoc tests on interaction
> effects.
>
> For that reason, I have spent some time in writing a wrapper to
> linearHypothesis() that might be helpful in those situations. I copy the
> commented code at the end of this message, because although I have
> successfully used it in some cases, I would like more knowledgeable people
> to put it to test (and eventually help me create a worthwile contribution for
> other people that could find it useful).
>
> This function (which I have called "factorltest.mlm") needs the multivariate
> linear model and the intrasubject-related arguments (idata,
> idesign...) that would be passed on to Anova() (from car) for a repeated
> measures analysis, or directly the Anova.mlm object returned by Anova()
> instead of idata, idesign... (I have tried to explain it clearly in the
> commentaries to the code.)
>
> Moreover, it needs an argument "levelcomb": a list that represents the level
> combinations of factors to be tested. There are different ways of
> representing those combinations (through names of factor levels, or
> coefficient vectors/matrices), and depending on the elements of that list the
> test is made for main effects, simple effects, interaction contrasts, etc.
>
> For instance, let me use an example with the OBrienKaiser data set (as in the
> help documentation for Anova() and linearHypothesis()).
>
> The calculation of the multivariate linear model and Anova is copied from
> those help files:
>
> > phase <- factor(rep(c("pretest", "posttest", "followup"), c(5, 5,
> 5)),
> + levels=c("pretest", "posttest", "followup"))
> > hour <- ordered(rep(1:5, 3))
> > idata <- data.frame(phase, hour)
> > mod.ok <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5,
> + post.1, post.2, post.3, post.4, post.5,
> + fup.1, fup.2, fup.3, fup.4, fup.5) ~
> treatment*gender,
> + data=OBrienKaiser)
> > av.ok <- Anova(mod.ok, idata=idata, idesign=~phase*hour)
>
> Then, let's suppose that I want to test pairwise comparisons for the
> significant main effect "treatment" (whose levels are c("control","A","B")).
> For the specific contrast between treatment "A"
> and the "control" group I can define "levelcomb" in the following
> (equivalent) ways:
>
> > levelcomb <- list(treatment=c("A","control")) levelcomb <-
> > list(treatment=c(A=1,control=-1)) levelcomb <-
> > list(treatment=c(-1,1,0))
>
> Now, let's suppose that I am interested in the (marginally) significant
> interaction between treatment and phase. First I test the simple main effect
> of phase for different levels of treament (e.g. for the "control"
> group). To do this, levelcomb must have one variable for each interacting
> factor (treatment and phase): levelcomb$treatment will specify the treatment
> that I want to fix for the simple main effects test ("control"), and
> levelcomb$phase will have a NA value to represent that I want to test all
> orthogonal contrasts within that factor:
>
> > levelcomb <- list(treatment="control",phase=NA)
>
> I could also use numeric vectors to define the levels of "treatment"
> that I want to fix, as in the previous example, or if I want a more
> complicated combination (e.g. if I want to test the phase effect for pooled
> treatments "A" and "B"):
>
> > levelcomb <- list(treatment=c(A=1,B=1),phase=NA)
>
> The NA value can be replaced by the specific contrast matrix that I would
> like to use. For instance, the previous statement is equivalent to the
> following one:
>
> > levelcomb <- list(treatment=c(0,1,1),phase=contr.sum(levels(phase)))
>
> And then, let's see an example of interaction contrast: I want to see if the
> difference between the "pre-test" phase and the following two, itself differs
> between the "control" group and the two treatments. This would be
>
> > levelcomb <- list(treatment=c(2,-1,-1),phase=c(2,-1,-1))
> or
> > levelcomb <-
> list(treatment=c(control=2,A=-1,B=-1),phase=c(pretest=2,posttest=-
> 1,followup=-1))
>
> etc.
>
> Now, to perform the test I just use this "levelcomb" list object with the
> model and ANOVA previously performed, in the following way:
>
> > factorltest.mlm(mod.ok,levelcomb,av.ok)
>
> Or if I want to use idata and idesign directly:
>
> > factorltest.mlm(mod.ok,levelcomb,idata=idata,idesign=~phase*hour)
>
> Of course, this function only performs one test at a time. To perform
> multiple tests as suggested by John, the user should run it in a loop (and
> then correct the p-values as he or she see fit).
>
> If you find this useful, please let me know if you also find some error,
> opportunity of improvement, or any commentary.
>
> Thanks,
> Helios
>
> [Now follows the full commented code:]
>
> # factorltest.mlm(modmlm,levelcomb,aovmlm,...)
> #
> # Hypothesis testing for linear contrasts of factor levels, for multivariate
> # linear models with response transformation, as in a repeated-measures
> ANOVA.
> #
> # Arguments:
> # modmlm: multivariate linear model (mlm object) # levelcomb: list with the
> combinations of levels for each factor, in form of # a literal expression or
> numeric matrix (see Details).
> # aovmlm: result from Anova to modmlm (i.e. Anova.mlm object) # ...:
> additional arguments passed to Anova() --from the car package-- to # perform
> an ANOVA with an intra-subject design (see Details).
> #
> # Details:
> # An intra-subject design is required for the response transformation of the
> # linear model. The intra-subject design may be defined by the arguments #
> idata, idesgin (and optionally icontrasts) defined for the function
> Anova()
> # in the car package, or alternatively by an Anova.mlm object resulting from
> # that function (applied to modmlm). If both modes of defining the intra-
> subjects # design are used, the design implied by aovmlm prevales.
> #
> # The name of each element in levelcomb must coincide with a factor (i.e. a
> main # effect) analysed in the ANOVA (as defined in aovmlm or indirectly by
> modmlm and # the other arguments). The value of each element can be: (1) the
> name of one level # of that factor, (2) a vector with two level names of that
> factor --to perform # a paired contrast between those levels--, (3) a named
> vector of coefficients # whose names coincide with some levels of the factor,
> for more specific linear # combinations --unspecified coefficients are
> assumed to be zero--, (4) an unnamed # vector of coefficients whose length is
> equal to the number of levels in the # factor --the values of the vector are
> assigned to the factor levels in the # default order--, (5) a matrix with
> named or unnamed rows, in which each column # represents a combination of
> levels as in (3) or (4).
> #
> # This function depends on linearHypothesis() in the car package. The
> combinations # specified in levelcomb are tested against zero or a matrix of
> zeros.
>
>
>
> factorltest.mlm <- function(modmlm,levelcomb,aovmlm,...){
> ## AUXILIARY FUNCTIONS
> # checkfactors() is used to check consistency of level combinations in
> the
> # factors defined in levelcomb, and convert them to numeric matrices
> checkfactors <- function(fframe,fnames,levelcomb){
> for (fname in fnames){
> f <- fframe[[fname]]
> fc <- levelcomb[[fname]]
> # Literal expressions must represent one level or a pair of
> levels
> if ((is.character(fc)) & (!any(is.na(fc)))){
> fcnames <- fc
> if (length(fc)==1){
> # If there is one level,
> evaluate the values at that level
> fc <- 1
> names(fc) <- fcnames
> }else if (length(fcnames)==2){
> # If there is a pair of levels,
> evaluate the contrast
> fc <- c(1,-1)
> names(fc) <- fcnames
> }else{
> stop("Incorrect number of
> literal levels for factor \"",fname,"\" (must be 1 or 2).")
> }
> }
> # Check the numeric coefficients of the factor levels
> if ((is.numeric(fc)) & (!any(is.na(fc)))){
> # Make fc into a matrix (in case it was a vector)
> fc <- as.matrix(fc)
> # Unnamed coefficient matrices must have the same rows
> as levels in the factor
> if (is.null(rownames(fc))){
> if (nrow(fc) != nlevels(f)){
> stop("Mismatch in the
> number of unnamed factor levels for \"",fname,"\".")
> }else{
> rownames(fc) <-
> levels(f)
> }
> }else{
> # Named coefficients are
> assigned to a matrix of factor levels, filled in with zeros
> flevels <-
> match(rownames(fc),levels(f))
> fctmp <- fc
> fc <-
> matrix(0,nrow=nlevels(f),ncol=ncol(fctmp))
> if (any(is.na(flevels))){
> stop("Mismatch in the
> names of factor levels for \"",fname,"\".")
> }
> fc[flevels,] <- fctmp
> rownames(fc) <- levels(f)
> }
> # Convert NA into default contrast matrix
> }else if (is.na(fc)){
> fc <- contrasts(f)
> }else{
> stop("Invalid value assigned to factor
> \"",fname,"\".")
> }
> levelcomb[[fname]] <- fc
> }
> return(levelcomb)
> }
>
> # makeT() creates a transformation matrix, used to define the Linear
> Hypothesis
> # matrix (for between-subjects factors) or the response transformation
> matrix
> # (for within-subjects factors), depending on the combinations of
> factor levels.
> # The transformation matrix is created progressively, by "translating"
> the combinations
> # of each factor into matrices that are sequentially multiplied.
> makeT <- function(fframe,fnames,levelcomb){
> # First matrix, defined from the identic rows of the
> between/within-subjects
> # model data frame, when unspecified factors are removed.
> # A dummy column with ones is added to the model data frame, to
> avoid
> # problems when all columns be eventually removed.
> # (The name of the dummy column is coerced to be different from
> any other one.)
> dummyname <- paste("z",max(fnames),sep=".")
> fframe[[dummyname]] <- 1
> # All factors are interacting, the transformation matrix
> (Tm) in the first step is diagonal.
> if (length(fnames)==ncol(fframe)-1){
> m <- nrow(fframe)
> Tm <- diag(m)
> }else{
> # Remove columns of unspecified factors
> fframe <- fframe[,c(fnames,dummyname)]
> # Vector with a different value for each combination of
> factors
> fframe_1 <- apply(fframe,1,paste,collapse="")
> n <- nrow(fframe)
> # Subset of unique elements
> fframe <- unique(fframe)
> fframe_1.m <- unique(fframe_1)
> m <- nrow(fframe)
> # Matrix with coefficients for averaging identical
> combinations of factors
> # Rows in Mo are the original (repeated) combinations
> To <- matrix(rep(fframe_1,each=m),nrow=m)
> # Columns in Mb are the unique combinations
> Tu <- matrix(rep(fframe_1.m,n),nrow=m)
> Tm <- (To==Tu) * m/n
> }
> # Number of contrasts to calculate for the current factor
> (initialized as 1)
> nc <- 1
> # Labels for the final transformation matrix (only defined for
> multiple contrasts)
> Tlabels <- NULL
> # Progressive transformation of Tm, factor by factor
> for (fname in fnames){
> # f is the current factor vector
> f <- fframe[[fname]]
> # n is the factor vector length
> n <- m*nc
> # Remove the corresponding column from the model data frame
> fframe[[fname]] <- NULL
> # nc is the number of contrasts for the current factor
> (updated)
> nc <- ncol(levelcomb[[fname]])
> if (nc > 1){
> # If there are multiple contrasts, the rows of the
> model data frame are
> # replicated (by the number of
> contrasts), and a new column is added to
> # assign a contrast to each group of copied rows.
> fframe[[ncol(fframe)+1]] <- 1
> fframe <- fframe[rep(1:n,nc),]
> fframe[[ncol(fframe)]] <-
> rep(1:nc,each=n)
> # Moreover, we create labels to identify what contrast
> is represented
> # by each row of the final
> transformation matrix
> newTlabels <-
> paste(fname,as.character(1:nc),sep="")
> if (is.null(Tlabels)){
> Tlabels <- newTlabels
> }else{
> # (In case there is more than
> one factor with multiple contrasts)
> Tlabels <-
> paste(rep(Tlabels,nc), rep(newTlabels,each=length(Tlabels)), sep=":")
> }
> }
> # The same routine as for the first Tm: vector with
> combined values
> # of the (transformed) model data frame...
> fframe_1 <- apply(fframe,1,paste,collapse="")
> # ... subset to unique rows
> fframe <- unique(fframe)
> fframe_1.m <- unique(fframe_1)
> m <- nrow(fframe)/nc
> # And create transformation matrix, depending on the
> combination of factor
> # levels defined in levelcomb
> kf <- t(levelcomb[[fname]])
> kf <- matrix(rep(kf[,f],each=m),ncol=n)
> To <-
> matrix(rep(matrix(fframe_1,ncol=n,byrow=TRUE),each=m),ncol=n)
> Tu <- matrix(rep(fframe_1.m,n),ncol=n)
> Tm <- ((To==Tu) * kf) %*% Tm
> }
> # When the loop is finished, assign labels to final Tm, and
> return it
> rownames(Tm) <- Tlabels
> return(Tm)
> }
>
> ## END OF AUXILIARY FUNCTIONS
> ## MAIN PROCEDURE
>
> # 1. Check class of modmlm and intra-subject design from input
> arguments
> if (missing(aovmlm)){
> aovmlm <- Anova(modmlm,...)
> }
>
> # 2. Define model data frames:
> # Between-subjects model data frame, with contrasts copied from linear
> model
> bframe <- expand.grid(modmlm$xlevels)
> bfactors <- names(bframe)
> for (bf in bfactors){
> contrasts(bframe[[bf]]) <- modmlm$contrasts[[bf]]
> }
> # Within-subjects model data frame, with contrasts copied from intra-
> subject design
> wframe <- aovmlm$idata
> wfactors <- names(wframe)
> for (wf in wfactors){
> if (is.null(attr(wframe[[wf]], "contrasts"))){
> contrasts(wframe[[wf]]) <- if
> (is.ordered(wframe[[wf]])) aovmlm$icontrasts[2] else aovmlm$icontrasts[1]
> }
> }
>
> # 3. Check that interacting factors in levelcomb are included in both
> the
> # between-subjects and within-subject designs
> ifactors <- names(levelcomb)
> iterm <- paste(ifactors,collapse=":")
> itermsort <- paste(sort(ifactors),collapse=":")
> anovaterms <-
> lapply(strsplit(aovmlm$terms,":"),function(x){paste(sort(x),collapse=":")})
> if (all(is.na(match(anovaterms,itermsort)))){
> warning("The term \"", iterm, "\" is not in the model
> design.")
> }
>
> # 4. Search factors of levelcomb in bframe and wframe, and convert the
> level
> # combinations into numeric matrix format
> bfactor.interact <- match(ifactors,bfactors)
> bfactors <-
> bfactors[bfactor.interact[!is.na(bfactor.interact)]]
> levelcomb <- checkfactors(bframe,bfactors,levelcomb)
> wfactor.interact <- match(ifactors,wfactors)
> wfactors <-
> wfactors[wfactor.interact[!is.na(wfactor.interact)]]
> levelcomb <- checkfactors(wframe,wfactors,levelcomb)
>
> # 5. Preliminary definition of the Linear Hypothesis matrix (L)
> rhf <- paste(as.character(formula(modmlm))[c(1,3)],collapse="
> ")
> L <- model.matrix(as.formula(rhf), data=bframe)
>
> # 6. Transformed Linear Hypothesis (L) and response transformation (P)
> matrices
> if (length(bfactors)){
> L <- makeT(bframe,bfactors,levelcomb) %*% L
> }else{
> L <- colSums(L)/nrow(L)
> }
> if (length(wfactors)){
> P <- t(makeT(wframe,wfactors,levelcomb))
> }else{
> P <- matrix(rep(1/nrow(wframe),nrow(wframe)))
> }
>
> # 7. Result, consisting in:
> # levelcomb: numeric matrix values of levelcomb
> # lsm: table of least square means for the tested
> interactions
> # test: test value, from LinearHypothesis
> result <- list(levelcomb=levelcomb,lsm=NULL,test=NULL)
> result$lsm <- L %*% modmlm$coefficients %*% P
> result$test <- linearHypothesis(modmlm,L,P=P)
> return(result)
> }
>
> INSTITUTO DE BIOMECÁNICA DE VALENCIA
> Universidad Politécnica de Valencia • Edificio 9C Camino de Vera s/n • 46022
> VALENCIA (ESPAÑA) Tel. +34 96 387 91 60 • Fax +34 96 387 91 69 www.ibv.org
>
> Antes de imprimir este e-mail piense bien si es necesario hacerlo.
> En cumplimiento de la Ley Orgánica 15/1999 reguladora de la Protección de
> Datos de Carácter Personal, le informamos de que el presente mensaje contiene
> información confidencial, siendo para uso exclusivo del destinatario arriba
> indicado. En caso de no ser usted el destinatario del mismo le informamos que
> su recepción no le autoriza a su divulgación o reproducción por cualquier
> medio, debiendo destruirlo de inmediato, rogándole lo notifique al remitente.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list