[R] alternatives to traditional least squares method in linear regression ?

Liaw, Andy andy_liaw at merck.com
Fri Nov 30 20:05:55 CET 2007


Coming to this late, but hopefully not too late...

You may want to try a mixture of regression models, which fits a
separate line to each group of points instead of one compromise line:

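install.packages("flexmix")   # only needed once
library(flexmix)

## simulate two groups of points, each with its own regression line
set.seed(1)                             # arbitrary seed, for reproducibility
x1 <- rnorm(100, sd=5)
y1 <- rnorm(100, mean=x1)               # group 1: y = x + noise (the diagonal)
x2 <- rnorm(50, sd=5)
y2 <- rnorm(50, mean=-5 + 0.5 * x2)     # group 2: y = -5 + 0.5 x + noise
x <- c(x1, x2)
y <- c(y1, y2)
plot(x, y)

## fit a mixture of two linear regressions
fit <- flexmix(y ~ x, k=2)
parameters(fit)                         # coefficients and sigma per component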
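
To see which points were assigned to which component, and to draw the
fitted lines, something along these lines should work. This is only a
sketch: it repeats the simulation so that it runs on its own, the seed
is arbitrary, and the names cl and p are just for illustration.

```r
library(flexmix)

set.seed(1)                                  # arbitrary seed, reproducibility only
x <- c(rnorm(100, sd=5), rnorm(50, sd=5))
y <- c(rnorm(100, mean=x[1:100]),            # group 1 follows the diagonal
       rnorm(50, mean=-5 + 0.5 * x[101:150]))  # group 2 lies off the diagonal

fit <- flexmix(y ~ x, k=2)

cl <- clusters(fit)                          # hard component assignment per point
table(cl)

## as.matrix() guards against the case where EM keeps a single component
p <- as.matrix(parameters(fit))              # rows: intercept, slope, sigma

plot(x, y, col=cl)
for (j in seq_len(ncol(p))) {
  abline(a=p[1, j], b=p[2, j], col=j)        # one fitted line per component
}
```

The EM fit starts from a random assignment, so the component labels (and
occasionally the solution itself) can change between runs; stepFlexmix()
with several repetitions (nrep) is one way to make that more stable.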

Andy 

From: Wolfgang Raffelsberger
> 
> Dear list,
> 
> I have run into a special case of linear regression where I'm not
> satisfied with the results of the traditional least squares method
> (sometimes called OLS) for estimating/optimizing the residuals to the
> regression line (see code below).  Basically, one group of my x-y
> points lies a bit off the diagonal (in my case the diagonal represents
> the ideal or theoretical fit between x and y, which are on the same
> scale), and these points have enough influence to impose a slope that
> deviates (too much) from the diagonal.  Using rlm() didn't help, since
> this is not a problem of rare outliers.
> From a pragmatic point of view, a linear regression approach fits the
> nature of the data and the comparison I'd like to perform very well,
> so I'd like to stay with something linear.
> 
> Has anybody already implemented a function or package in R that allows
> modifying the exponent (of the least squares method) or, more
> generally, defining the model used for estimating/optimizing the
> residuals?
> 
> Thanks in advance
> Wolfgang Raffelsberger
> 
> 
>  > plot(x,y)    # x and y are my data
>  > regr <- lm(y~x)
>  > abline(regr)
>  > # I'm not satisfied with the line, since one group of points follows
>  > # the diagonal very well but the regression is pulled away by
>  > # another group of points ...
>  >
>  > sessionInfo()
> R version 2.6.0 (2007-10-03)
> i386-pc-mingw32
> 
> locale:
> LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252
> 
> attached base packages:
> [1] stats     graphics  grDevices datasets  tcltk     utils     methods
> [8] base
> 
> other attached packages:
> [1] svSocket_0.9-5 svIO_0.9-5     R2HTML_1.58    svMisc_0.9-5   svIDE_0.9-5
> 
> loaded via a namespace (and not attached):
> [1] tools_2.6.0
> 
> 
>  
> 
> 
> Wolfgang Raffelsberger, PhD
> Laboratoire de BioInformatique et Génomique Intégratives
> CNRS UMR7104, IGBMC
> 1 rue Laurent Fries,  67404 Illkirch  Strasbourg,  France
> Tel (+33) 388 65 3300         Fax (+33) 388 65 3276
> wolfgang.raffelsberger at igbmc.u-strasbg.fr
> 
> 
> 
> 

