[R] outlier
Prof Brian Ripley
ripley at stats.ox.ac.uk
Tue Jun 17 18:51:32 CEST 2003
On Tue, 17 Jun 2003, kan Liu wrote:
> I want to calculate the R-squared between two variables. Can you advise
> me how to identify and remove the outliers before performing R-squared
> calculation?
Easy: you don't. It makes no sense to consider R^2 after arbitrary outlier
removal: if I remove all but two points I get R^2 = 1!
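
A quick way to see this, as a minimal sketch on simulated data (the variable
names and the simulated data are illustrative assumptions, not part of the
original question). For two variables, R^2 is the squared Pearson
correlation, so cor()^2 is used here:

```r
# Simulated, unrelated variables: any apparent fit is spurious.
set.seed(1)
x <- rnorm(50)
y <- rnorm(50)

# R^2 on all 50 points (squared Pearson correlation for two variables).
r2_all <- cor(x, y)^2

# "Remove outliers" until only two points remain: a line through
# two points fits them exactly, so R^2 is 1 regardless of the data.
keep <- 1:2
r2_two <- cor(x[keep], y[keep])^2

print(c(all_points = r2_all, two_points = r2_two))
```

Any two points with distinct x and y values give the same result, which is
why R^2 computed after discretionary deletion says nothing about the data.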
R^2 is normally used to measure the success of a multiple regression, but
as you mention two variables, did you just mean the Pearson
product-moment correlation? It makes more sense to use a robust measure
of correlation, as in cov.rob (package lqs) or even Spearman or Kendall
measures (cor.test in package ctest).
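
As a rough sketch of these alternatives on simulated data with a few gross
outliers (the data and variable names are assumptions for illustration). In
current versions of R, cov.rob is in package MASS and cor.test is in package
stats, rather than in lqs and ctest:

```r
library(MASS)  # provides cov.rob in current R (formerly package lqs)

# Simulated data: a strong linear relation, then five contaminated points.
set.seed(42)
x <- rnorm(100)
y <- 0.8 * x + rnorm(100, sd = 0.5)
y[1:5] <- y[1:5] + 15

# Classical Pearson correlation, sensitive to the contaminated points.
pearson <- cor(x, y)

# Rank-based measures via cor.test (estimate only; it also gives a test).
spearman <- unname(cor.test(x, y, method = "spearman")$estimate)
kendall  <- unname(cor.test(x, y, method = "kendall")$estimate)

# Robust correlation from a resistant covariance estimate.
robust <- cov.rob(cbind(x, y), cor = TRUE)$cor[1, 2]

print(c(pearson = pearson, spearman = spearman,
        kendall = kendall, robust = robust))
```

No points are deleted by hand here: the robust and rank-based estimators
downweight or ignore the aberrant observations themselves.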
If you intended to do this for a multiple regression, you need to do some
sort of robust regression and use a robust measure of fit.
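
One possible sketch of that approach, assuming rlm from package MASS as the
robust regression and its robust residual scale estimate as the measure of
fit. Both choices are illustrative assumptions (other robust fits such as
lqs would also qualify), and the data are simulated:

```r
library(MASS)  # provides rlm

# Simulated multiple regression with five grossly shifted responses.
set.seed(7)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - x2 + rnorm(n, sd = 0.5)
y[1:5] <- y[1:5] + 20

# Ordinary least squares, for comparison only.
fit_ls <- lm(y ~ x1 + x2)

# Robust regression using an MM-estimator.
fit_rob <- rlm(y ~ x1 + x2, method = "MM")

# Coefficients and residual scale: the least-squares sigma is inflated
# by the shifted points, while the robust scale estimate resists them.
print(coef(fit_rob))
print(c(ls_sigma = summary(fit_ls)$sigma, robust_scale = fit_rob$s))
```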
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595