R-beta: formula() and model formulae

Thu May 8 02:13:42 CEST 1997

Peter Dalgaard writes:
 > Bill Venables <wvenable at attunga.stats.adelaide.edu.au> writes:
 > 
 > > Ah, the Helmert contrasts bête noire.  For ANOVA the contrast
 > > matrix used is mostly irrelevant.  For regression models I agree,
 > > treatment contrasts would be generally more easily interpreted.
 > 
 > Understatement of the year... Last time I bumped into them, it took me
 > and a colleague more than an hour to figure out how to interpret the
 > regression coefficients, and, I may add, the solution was *not* what
 > the white book said it was (it's not just one level minus the average
 > of the preceding, the parameter is also scaled by the reciprocal of
 > the level number). [There's a split-second solution -- see below --
 > but we sort of didn't think of it at the time...] 
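
One way to check the interpretation described above is to invert the
coding matrix.  If the level means satisfy mu = cbind(1, C) %*% beta,
then beta = solve(cbind(1, C)) %*% mu, so each row of the inverse shows
how one coefficient is built from the level means.  The sketch below
assumes R's built-in contr.helmert() and four levels; the names C and B
are illustrative only, and this is not necessarily the shortcut the
poster had in mind.

```r
# Helmert coding for a 4-level factor (base R, stats package)
C <- contr.helmert(4)

# Rows of B express each coefficient as a linear combination
# of the four level means (intercept first, then the 3 contrasts).
B <- solve(cbind(1, C))

# Coefficient k + 1 should be
#   (mean of level k + 1  minus  average of levels 1..k) / (k + 1)
# e.g. the third row is (mu3 - (mu1 + mu2)/2) / 3.
print(round(B, 4))
```

Reading the rows this way shows the extra scaling by the reciprocal of
the level number directly, without any algebra by hand.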

A few weeks ago I gave a fairly detailed discussion of how to
relate contrast matrices and their interpretation in s-news.  I
could re-issue it or post it to people if that was their wish.

There is also to be an extended discussion of the subject in V&R2
due out in July, with a further elaboration to appear (real soon
now...) in the online complements.

 > > I presume the reason they were used at all is because if you have
 > > equal replication of everything the Helmert contrasts give you a
 > > model matrix with orthogonal columns, so all estimates are
 > > uncorrelated.  Whenever do you get equal replication, though?
 > 
 > Hardly ever. Actually, I thought that the point was not so much
 > orthogonality, but the successive testing (A=B, A=B=C, A=B=C=D,...).
 > However, that is just plainly wrong outside of balanced ANOVAs.
 > And, even in that case, once the first two levels differ, the rest
 > of the coefficients lose all meaning.

Indeed.  That's why I tended to discount that possibility myself.
Here is a contrast matrix generator I sometimes prefer to use
that corresponds to testing A=B, B=C, C=D, ...  Of course the
contrasts are not mutually orthogonal.  How it works is left as a
little puzzle.  (This function works in S.  I haven't tested it
in R, but it should work if lower.tri() is available.)

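contr.sdif <- function(n, contrasts = TRUE)
{
# contrasts generator giving `successive difference' contrasts.
# n is either the number of levels or a vector of level labels.
  if(is.numeric(n) && length(n) == 1) {
    # a single number: must be a whole number of at least 2 levels
    if(n %% 1 || n < 2)
      stop("invalid number of levels")
    lab <- as.character(seq(n))
  }
  else {
    # otherwise treat n as the level labels themselves
    lab <- as.character(n)
    n <- length(n)
    if(n < 2)
      stop("invalid number of levels")
  }
  if(contrasts) {
    # start with every entry equal to its column index j ...
    contr <- col(matrix(nrow = n, ncol = n - 1))
    # ... then subtract n wherever row <= column
    upper.tri <- !lower.tri(contr)
    contr[upper.tri] <- contr[upper.tri] - n
    # scale by 1/n and label the columns "2-1", "3-2", ...
    structure(contr/n, dimnames = list(lab, paste(
      lab[-1], lab[ - n], sep = "-")))
  }
  else structure(diag(n), dimnames = list(lab, lab))
}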
> contr.sdif(4)
    2-1  3-2   4-3 
1 -0.75 -0.5 -0.25
2  0.25 -0.5 -0.25
3  0.25  0.5 -0.25
4  0.25  0.5  0.75
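
For readers who would rather check the answer than solve the puzzle,
here is a minimal sketch in R.  It rebuilds the same 4-level matrix
with a hypothetical helper, sdif_matrix() (not part of the original
post), inverts cbind(1, C) to see what each coefficient estimates, and
then fits a one-way model to made-up data.  The data, the helper and
the variable names are all assumptions for illustration.

```r
# Hypothetical helper: same construction as the contrasts branch above,
# without the labels.
sdif_matrix <- function(n) {
  contr <- col(matrix(nrow = n, ncol = n - 1))
  upper <- !lower.tri(contr)          # positions with row <= column
  contr[upper] <- contr[upper] - n
  contr / n
}

C <- sdif_matrix(4)

# Rows of B show how each coefficient is formed from the level means:
# the first row averages them, the others take successive differences.
B <- solve(cbind(1, C))
print(round(B, 4))

# Made-up data with level means 2, 5, 5 and 11.
f <- factor(rep(c("A", "B", "C", "D"), each = 2))
y <- c(1, 3, 4, 6, 5, 5, 10, 12)
contrasts(f) <- C
fit <- lm(y ~ f)

# Intercept: average of the four level means; then B-A, C-B, D-C.
print(coef(fit))
```

The columns of the contrast matrix are not orthogonal, as noted above,
but the coefficients still read off directly as differences between
adjacent levels.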

-- 
Bill Venables, Head, Dept of Statistics,    Tel.: +61 8 8303 5418
University of Adelaide,                     Fax.: +61 8 8303 3696
South AUSTRALIA.     5005.   Email: Bill.Venables at adelaide.edu.au