[R] Automatically Remove Aliased Terms from a Model

Thaler,Thorn,LAUSANNE,Applied Mathematics Thorn.Thaler at rdls.nestle.com
Mon Oct 28 17:19:59 CET 2013

Dear all,

I am trying to implement a function which removes aliased terms from a model. The challenge I am facing is that with "alias" I get the aliased coefficients of the model, which I have to translate into the terms from the model formula. What I have tried so far:

d <- expand.grid(a = 0:1, b=0:1)
d$c <- (d$a + d$b)  %% 2
d$y <- rnorm(4)
d <- within(d, {a <- factor(a); b <- factor(b); c <- factor(c)})
l <- lm(y ~ a * b + c, d)

removeAliased <- function(mod) {
  ## Retrieve all terms in the model
  X <- attr(mod$terms, "term.label")
  ## Get the aliased coefficients  
  rn <- rownames(alias(mod)$Complete)
  ## remove factor levels from coefficient names to retrieve the terms
  regex.base <- unique(unlist(lapply(mod$model[, sapply(mod$model, is.factor)], levels)))
  aliased <- gsub(paste(regex.base, "$", sep = "", collapse = "|"),  "", gsub(paste(regex.base, ":", sep = "", collapse = "|"), ":", rn))
  uF <- formula(paste(". ~ .", paste(aliased, collapse = "-"), sep = "-"))
  update(mod, uF)


This function works in principle, but this workaround with removing the factor levels is just, well, a workaround which could cause problems in some circumstances (when the name of a level matches the end of another variable, when I use a different contrast and R names the coefficients differently etc. - and I am not sure which other cases I am overlooking).

So my question is whether there are some more intelligent ways of doing what I want to achieve? Is there a function to translate a coefficient of a LM back to the term, something like:

termFromCoef("a1") ## a1
termFromCoef("a1:b1") ## a:b

With this I could simply translate the rownames from alias into the terms needed for the model update.

Thanks for your help.

Kind Regards,

Thorn Thaler 
NRC Lausanne
Applied Mathematics

More information about the R-help mailing list