[R] predict.lm(...,type="terms") question

peter dalgaard pdalgd at gmail.com
Sun Sep 2 08:35:14 CEST 2012


On Sep 2, 2012, at 03:38 , David Winsemius wrote:

> 
> Why should predict not complain when it is offered a newdata argument that does not contain a vector of values for "x"? The whole point of the terms method of prediction is to offer estimates for specific values of items on the RHS of the formula. The OP seems to have trouble understanding that point. Putting in a vector with the name of the LHS item makes no sense to me. I certainly cannot see that any particular behavior for this pathological input is described for predict.lm in its help page, but throwing an error seems perfectly reasonable to me.

Yes. Lots of confusion going on here. 

First, data= is _always_ used as the _first_ place to look for variables; if they are not found there, the search continues in the formula's environment. To be slightly perverse, notice that even this works, although d contains neither y nor x and has a different number of rows:

> y <- rnorm(10)
> x <- rnorm(10)
> d <- data.frame(z=rnorm(9))
> lm(y ~ x, d)

Call:
lm(formula = y ~ x, data = d)

Coefficients:
(Intercept)            x  
    -0.2760       0.2328  
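
The lookup order can also be seen from the other side. The following is only a sketch with made-up data (the seed, the values and the names fit_env, fit_data, used_env and used_data are assumptions for illustration): when d does contain an x, that x is used instead of the one in the workspace, while y is still found in the formula's environment.

```r
set.seed(1)
x <- rnorm(10)                      # x in the workspace
y <- 2 * x + rnorm(10, sd = 0.1)    # y in the workspace
d <- data.frame(x = rnorm(10))      # a different, unrelated x inside the data frame

fit_env  <- lm(y ~ x)               # x and y both come from the formula's environment
fit_data <- lm(y ~ x, data = d)     # x comes from d, y from the formula's environment

# model.frame() returns the variables each fit actually used
used_env  <- model.frame(fit_env)$x
used_data <- model.frame(fit_data)$x

isTRUE(all.equal(used_env, x))      # workspace x was used without data=
isTRUE(all.equal(used_data, d$x))   # d$x takes precedence when data = d is given
```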

Secondly, what is predict(..., type="terms") supposed to have to do with inverting a regression equation? That is simply not what it does: it only splits the prediction formula into its constituent terms.
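
To make the splitting concrete, here is a small sketch on simulated data (the variables x1 and x2, the coefficients and the newdata values are all invented for illustration). The matrix returned by type="terms" has one column per term on the RHS, and adding its row sums to the "constant" attribute gives back the ordinary prediction.

```r
set.seed(1)
d <- data.frame(x1 = rnorm(20), x2 = rnorm(20))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(20)
fit <- lm(y ~ x1 + x2, data = d)

# newdata supplies values for the RHS variables, not for y
nd <- data.frame(x1 = c(-1, 0, 1), x2 = c(0.5, 0, -0.5))

tt <- predict(fit, newdata = nd, type = "terms")  # one (centered) column per term
total <- rowSums(tt) + attr(tt, "constant")       # reassemble the full prediction

colnames(tt)
all.equal(unname(total), unname(predict(fit, newdata = nd)))
```

Nothing in this output goes from y back to x; it is the same prediction, just reported term by term.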

Thirdly, no: you do not invert a regression equation by regressing y on x. That only works if you can be sure that your new (x, y) are sampled from the same population as the data, which is not going to be the case if you are fitting to data with, say, selected equispaced x values. There is a whole literature on how to do this properly; Google e.g. "inverse calibration" for enlightenment.
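
As a rough illustration of the difference, the sketch below uses simulated data with equispaced x (the design, the coefficients and the new response y0 are assumptions, not anything from the original thread). It computes the point estimate usually called the classical calibration estimator, which solves the fitted line for x, and contrasts it with the fit that has the roles of the two variables swapped. It gives point estimates only; interval estimates are what the calibration literature is really about.

```r
set.seed(1)
x <- seq(1, 10, length.out = 10)            # selected, equispaced design points
y <- 3 + 0.5 * x + rnorm(10, sd = 0.3)
fit <- lm(y ~ x)
b <- coef(fit)

y0 <- 6                                      # hypothetical newly observed response
x0_classical <- unname((y0 - b[1]) / b[2])   # solve y0 = b0 + b1 * x0 for x0

# For contrast: swapping the roles, which treats x as if it were sampled at random
rev_fit <- lm(x ~ y)
x0_reverse <- unname(predict(rev_fit, newdata = data.frame(y = y0)))

c(classical = x0_classical, reverse = x0_reverse)
```

The two estimates coincide only when the fit is perfect or y0 equals the mean of y; otherwise the swapped regression pulls the estimate toward the mean of the design points.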

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com



