[R] Robust Non-linear Regression
Martin Maechler
maechler at stat.math.ethz.ch
Mon Nov 14 12:41:16 CET 2005
Package 'sfsmisc' has had a function 'rnls()' for a while
which does robust non-linear regression via M-estimation.
[The name of the function is probably *really* a misnomer,
since the 'ls' part stands for "least squares"!]
Two weeks ago, there's been a small workshop
"Robustness and R" in Treviso (It),
http://www.dst.unive.it/rsr/
where we've talked about available and missing robustness
functionality ``in R''. One consequence of the workshop is the
new mailing list "R-SIG-Robust" {to which I CC this message}
and another planned and hopefully even more consequential
consequence will be collaboration on producing more widely
available robustness functionality for R. Do subscribe to the
list if you are interested.
>>>>> "Vermeiren" == Vermeiren, Hans [VRCBE] <Vermeiren>
>>>>> on Sun, 13 Nov 2005 22:47:31 +0100 writes:
Vermeiren> Hi, I'm trying to use Robust non-linear
Vermeiren> regression to fit dose response curves. Maybe I
Vermeiren> didnt look good enough, but I dind't find robust
Vermeiren> methods for NON linear regression implemented in
Vermeiren> R. A method that looked good to me but is
Vermeiren> unfortunately not (yet) implemented in R is
Vermeiren> described in
Vermeiren> http://www.graphpad.com/articles/RobustNonlinearRegression_files/frame.htm
Vermeiren> in short: instead of using the premise that the
Vermeiren> residuals are gaussian they propose a Lorentzian
Vermeiren> distribution, in stead of minimizing the squared
Vermeiren> residus SUM (Y-Yhat)^2, the objective function is
Vermeiren> now SUM log(1+(Y-Yhat)^2/ RobustSD)
Vermeiren> where RobustSD is the 68th percentile of the
Vermeiren> absolute value of the residues
Vermeiren> my question is: is there a smart and elegant way
Vermeiren> to change to objective function from squared
Vermeiren> Distance to log(1+D^2/Rsd^2) ?
no; not easily.
Vermeiren> or alternatively to write this as a weighted
Vermeiren> non-linear regression where the weights are
Vermeiren> recalculated during the iterations in nlme it is
Vermeiren> possible to specify weights, possibly that is the
Vermeiren> way to do it, but I didn't manage to get it
Vermeiren> working the weights should then be something
Vermeiren> like:
Vermeiren> SUM (log(1+(resid(.)/quantile(all_residuals,0.68))^2))
Vermeiren> / SUM (resid(.))
rnls() mentioned does use robust weights and IRLS (iteratively
reweighted LS) making use of nls() and rlm(),
similarly to your suggestion.
Vermeiren> the test data I use :
Vermeiren> x<-seq(-5,-2,length=50)
Vermeiren> x<-rep(x,4)
Vermeiren> y<-SSfpl(x,0,100,-3.5,1)
Vermeiren> y<-y+rnorm(length(y),sd=5)
Vermeiren> y[sample(1:length(y),floor(length(y)/50))]<-200 # add 2% outliers at 200
Since you have only outliers in 'y' and none in 'x',
you could use the 'nlrq' (nonlinear regression quantiles)
package that Roger Koenker mentioned.
To really robustify such self starting models as the 4-parameter
logistic 'SSfpl' above, you would also need to provide a robust
initial estimator;
maybe that could be done pretty easily 'rlm()' instead of 'lm()' and
using 'rnls()' instead of 'nls()' also for the "initial" part in
something like
SSfpl.rob <-
selfStart(~ A + (B - A)/(1 + exp((xmid - input)/scal)),
initial = function( ...) { ...... },
parameters= c("A","B","xmid","scal"))
{look at 'SSfpl() for the initial estimator}.
However, BTW, currently the "plinear" version fails for our robust
nonlinear procedure 'rnls()'.
Martin Maechler, ETH Zurich
More information about the R-help
mailing list