[R] Help with predict.lm

Mike White mikewhite.diu at tiscali.co.uk
Tue Apr 19 17:53:10 CEST 2005


Ted
I agree that with the example data the "inverse regression" approach would
be adequate, but it would be useful to have a function to predict the
results and confidence intervals correctly. I look forward to your solution
to the calibration problem.

In the mean time, I have looked at the calib function referred to by Andy
Liaw (http://finzi.psych.upenn.edu/R/Rhelp02a/archive/1202.html) but the
formula for the confidence intervals appears to be incorrect (at least
according to "Quality Assurance in Analytical Chemistry, W.Funk, V. Dammann
and G.Donnevert). The brackets multipled by term1 should have been to the
power 0.5 (square root)  not 2 (squared). Also the function needs be
multiplied by the t value for the appropriate probability and degrees of
freedom. May corrected code is shown below, any suggestions for calculating
t?


# R function to do classical calibration
# input the x and y vectors and the value of y to make an
# estimate for
# val is a value of dep
# returns the calibrated value of x and it's
# single confidence interval about x
# form of calibration particularly suited to jackknifing
calib <- function(indep, dep, val)
{      # generate the model y on x and get out predicted values for x
        reg <- lm(dep~indep)
        xpre <- (val - coef(reg)[1])/coef(reg)[2]

        # generate a confidence
        yyat <- ((dep - predict(reg))^2)
        sigyyat <- ((sum(yyat)/(length(dep)-2))^0.5)
        term1 <- sigyyat/coef(reg)[2]
        sigxxbar <- sum((indep - mean(indep))^2)
        denom <- sigxxbar * ((coef(reg)[2])^2)
         t<- ## needs function to assign t value from t-table
        conf <- t*abs(((1+(1/length(dep))+(((val -
mean(dep))^2)/denom))^0.5)*term1) ## squared term changed to square root
results<-list(Predicted=xpre, Confidence=conf)  ## returns results as list
to avoid warning message
return(results)
}

Thanks
Mike White
----- Original Message -----
From: "Ted Harding" <Ted.Harding at nessie.mcc.ac.uk>
To: "Mike White" <mikewhite.diu at tiscali.co.uk>
Cc: <R-help at stat.math.ethz.ch>
Sent: Tuesday, April 19, 2005 3:20 PM
Subject: RE: [R] Help with predict.lm


> On 19-Apr-05 Mike White wrote:
> > Hi
> > I have measured the UV absorbance (abs) of 10 solutions
> > of a substance at known concentrations (conc) and have
> > used a linear model to plot a calibration graph with
> > confidence limits.  I now want to predict the concentration
> > of solutions with UV absorbance results given in the
> > new.abs data.frame, however predict.lm only appears to work
> > for new "conc" variables not new "abs" variables.
> >
> > I have search the help files and did find a similar problem
> > in June 2000, but unfortunately no solution was offered.
> > Any help and how to use predict.lm with the new "abs" data
> > to predict "conc" with confidence limits would be appreciated.
> >
> >     conc<-seq(100, 280, 20) #  mg/l
> >     abs<-c(1.064, 1.177, 1.303, 1.414, 1.534, 1.642, 1.744,
> >            1.852, 1.936,2.046) # absorbance units
> >     lm.calibration<-lm(abs ~ conc)
> >     pred.w.plim <- predict(lm.calibration,  interval="prediction")
> >     pred.w.clim <- predict(lm.calibration,  interval="confidence")
> >     matplot(conc, cbind(pred.w.clim, pred.w.plim[,-1]),
> >        lty=c(1,2,2,3,3), type="l", ylab="abs", xlab= "conc mg/l")
> >     points(conc, abs, pch=21, col="blue")
> >
> >     new.abs<-data.frame(abs=c(1.251, 1.324, 1.452))
> >
> >     predict(calibration.lm, new.abs) # does not work
>
> Apart from the apparent typo "calibration.lm" which Matthias pointed
> out, this will not work anyway since "predict" predicts in the
> same direction as the model was fitted in the first place.
>
> You have fitted "abs ~ conc", so the lm object lm.calibration
> contains a representation of the model equivalent to
>
>   abs = a + b*conc
>
> When you use predict(calibration.lm, new.abs) it will expect
> the dataframe "new.abs" to contain values in a list named "conc".
> Since you have supplied a list named "abs", this is apparently
> ignored, and as a result the function uses the default dataframe
> which is the data encapsulated in the calibration.lm object.
>
> You can verify this by comparing the outputs of
>
>   predict(lm.calibration, new.abs)
>
> and
>
>   predict(lm.calibration)
>
> You will see that they are identical!
>
> Either way, "predict" predicts "ans" from "conc" by using the
> above equation. It does not attempt to invert the equation
> even if you call the "new" data "abs".
>
> Given the apparently extremely close fit, in your data, to
> a straight line, you are likely to get perfectly adequate
> results simply by doing the regression the other way round,
> i.e.
>
> > lm.calibration2<-lm(conc ~ abs)
> > new.abs<-data.frame(abs=c(1.251, 1.324, 1.452))
> > predict(lm.calibration2,new.abs)
>        1        2        3
> 131.3813 144.7453 168.1782
>
> which looks about right!
>
> Strictly speaking, this approach is somewhat invalid, since
> under the model
>
>    abs = a + b*conc + error
>
> the "inverse regression"
>
>    conc = (abs - a)/b = A + B*abs
>
> does not have the standard statistical properties because
> of the theoretical possiblity (P > 0) that the estimated b
> can be arbitrarily close to 0 (with the result that, theoretically,
> the estimate of B = 1/b has neither expectation nor variance),
> and the estimates of A and B could be a long way out.
>
> Also, the confidence intervals you would get by using the
> residuals from this "inverse regression" in the usual way
> would not be valid, since they would be finite for all
> values of the confidence level. In reality, for a sufficiently
> high confidence level (but still short of 100%) you will get
> a confidence "interval" consisting of several parts with
> gaps between them, or an infinite confidence interval.
>
> This takes place at confidence levels corresponding to
> the significance level at which you can no longer reject
> the hypothesis "b = 0" in the first euqation. In your case
> this significance level would be extremely small (I get
> 2.71e-12, so you can get confidence intervals up to
> at least 99.999999999% confidence level before you need to
> worry much about the confidence interval!
>
> In your case where there seems to be a very tight linear
> relationship, the possible misrepresentation of the
> inverse relationship due to this effect is very unlikely
> to have occurred.
>
> The correct approach has in the past been the subject of
> at times quite controversial discussion, under the title
> indeed of "The Calibration Problem". Nowadays this problem
> would be approached by making the concentrations to be
> "predicted" additional unknown parameters, and evaluating
> likelihood ratios for possible values of these.
>
> I don't have time at the moment to go into this approach,
> but will try to write something later.
>
> It seems there is nothing in R at present for this kind of
> general (and basically simple) use, though maybe there is
> something usable in the package mscalib (which I have not
> studied); however, by the look of the contents, the routines
> therein may be too elaborately specialised towards a specific
> type of application to be convenient for general use in
> conjunction with lm.
>
> Till later,
> Ted.
>
>  it might be
> worth spelling it out for
> people who would like to implement it themselves. Later.
>
>
>
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
> Fax-to-email: +44 (0)870 094 0861
> Date: 19-Apr-05                                       Time: 15:20:26
> ------------------------------ XFMail ------------------------------
>




More information about the R-help mailing list