[R] environments again

Thomas Lumley tlumley at u.washington.edu
Mon Dec 17 23:55:34 CET 2001

Yes, there's a bug. It's not as simple as your email suggests -- and it
provides a nice illustration of why a reproducible example is much more
helpful than a hypothesis ("adding an extra argument makes it unable to
find the first argument").

Also note that the problem doesn't happen if the variables are supplied
through a data= argument, which is a simple way to stop this happening and
is generally a Good Thing.
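
As a minimal sketch of the data= workaround: the data values and the
wrapper function run_aov() below are made up for illustration, and only the
variable names (why, ex, ess) follow the example later in this message. The
point is that every variable in the formula is looked up in the data frame,
so the formula's environment should not matter.

```r
# Hypothetical data and wrapper; only the variable names come from this thread.
run_aov <- function() {
  d <- data.frame(
    why = c(1.2, 2.3, 3.1, 4.8, 5.2, 6.9, 7.1, 8.4),
    ex  = factor(rep(c("a", "b"), times = 4)),
    ess = factor(rep(1:4, each = 2))
  )
  # All variables come from `d`, not from the calling frame.
  aov(why ~ ex + Error(ess), data = d)
}

fit <- run_aov()
print(fit)
```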

Read on for more detail than you probably want. Here's a simpler version
that shows the problem and doesn't use common variable names -- with your
functions and the internals of aov there were too many things called `y'
for my taste.


  print(aov(why~ex)) 		# works
  print(aov(why~ex+Error(ess))) # doesn't

So the problem has something to do with Error() terms.

The traceback() shows that the error occurs inside aov, when it creates
a new lm(why~ess) call to handle the Error(). At this point we have

  Browse[1]> ecall
  lm(formula = why ~ ess, singular.ok = TRUE, method = "qr", qr = TRUE)
  Browse[1]> eval(ecall,parent.frame())
  Error in eval(expr, envir, enclos) : Object "why" not found

but evaluating seemingly the same explicit formula works

  Browse[1]> eval(quote(lm(formula = why ~ ess, singular.ok = TRUE, method =
  "qr", qr = TRUE)),parent.frame())

  lm(formula = why ~ ess, method = "qr", qr = TRUE, singular.ok = TRUE)

  (Intercept)          ess
          2.5          1.0

This suggests that we have a problem with formula environments, and indeed
 Browse[1]> ls(env=environment(formula(ecall)))
  [1] "Call"        "Terms"       "allTerms"    "contrasts"   "data"
  [6] "eTerm"       "ecall"       "errorterm"   "formula"     "indError"
 [11] "intercept"   "lmcall"      "opcons"      "projections" "qr"
where the original formula argument has
 Browse[1]> ls(env=environment(formula))
 [1] "ess" "ex"  "why"
agreeing with
 Browse[1]> ls(env=parent.frame())
 [1] "ess" "ex"  "why"

So it's a bug in aov() caused by the relatively new scoping rules for
formulas, where variables that aren't found in a specified data frame are
now sought in the environment of the formula.
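
To illustrate the scoping rule, here is a small sketch. make_formula() and
its data are hypothetical; the variable names are borrowed from the example
above. The formula is created inside a function, so its environment is that
function's frame, and lm() called without data= is able to find the
variables there even though they are not visible at top level.

```r
# Hypothetical helper: the formula remembers the frame it was created in.
make_formula <- function() {
  why <- c(1, 2, 3, 4)
  ess <- c(0, 1, 0, 1)
  why ~ ess
}

f <- make_formula()

# The variables live in the formula's environment, not the global one.
print(ls(environment(f)))

# No data= given, so the variables are sought in environment(f).
fit <- lm(f)
print(coef(fit))
```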

In most cases this is an improvement over the previous rules, but it
causes problems for functions that do surgery on formulas, like aov() and

I think a fix should be simple but it may be too late for 1.4.0, which is
due nearly tomorrow.

