[R] what to do if residuals produced by lm() have long tails?
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri Nov 30 07:54:47 CET 2007
On Thu, 29 Nov 2007, tom soyer wrote:
> Hi,
>
> I am using lm() for regression analysis of my data set. My regression
> results look pretty good, i.e., the coefficient is significant and the p
> value is much less than 0.05. But when I checked the residuals, both using
> qqnorm() and hist(), the distribution does not look normal. It looks like
> the residuals have long tails. I assume that lm() uses OLS, and since one of
> the assumptions of OLS is that the residuals have to be normally distributed,
> I am wondering if this means I should reject my regression results
> altogether. If so, then what should I use instead? Are there ways to deal with
> distributions with long tails using lm() or OLS, or entirely different
> models are needed instead?
The main point is that least squares is rather inefficient with
long-tailed error distributions. Robust methods are designed to be
efficient for a wide class of long-tailed distributions, and so are
preferable. Use e.g. rlm (package MASS) or lmRob (package robust) in
place of lm. If this makes a difference to your 'regression results', then
yes, you need to reject the least-squares results.
This is discussed in good texts on doing statistics with R, e.g. MASS.
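A minimal sketch of that comparison, for illustration only. The data are
simulated (the model y = 1 + 2x and the t-distributed errors with 2 degrees
of freedom are assumptions made for this example, not taken from the
original question), and rlm is called with its defaults:

```r
library(MASS)  # provides rlm()

set.seed(1)

# Simulated data (hypothetical): a linear trend plus long-tailed errors
n <- 200
x <- runif(n, 0, 10)
y <- 1 + 2 * x + rt(n, df = 2)  # Student t errors, 2 degrees of freedom

fit_ls  <- lm(y ~ x)   # ordinary least squares
fit_rob <- rlm(y ~ x)  # robust fit; the default is a Huber M-estimator

# Put the two sets of coefficients side by side
coef_table <- cbind(least_squares = coef(fit_ls), robust = coef(fit_rob))
print(coef_table)
```

If the two columns differ by much relative to the standard errors reported
by summary(fit_rob), the least-squares fit is being driven by the tails.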
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595