[R] implementing Grubbs outlier test on a large dataframe
    Frank E Harrell Jr 
    f.harrell at vanderbilt.edu
       
    Sun Feb 15 01:23:52 CET 2009
    
    
  
John Malone wrote:
> Hi!
> 
> I'm trying to implement an outlier test once/row in a large dataframe.
> Ideally, I'd do this then add the Pvalue results and the number flagged as
> an outlier as two new separate columns to the dataframe.  Grubbs outlier
> test requires a vector and I'm confused how to make each row of my dataframe
> a vector, followed by doing a Grubbs test for each row containing the vector
> of numbers I want to perform the outlier test on.
> 
> I'm new to R and no doubt this is a simple problem. Any help you might
> provide would be greatly appreciated.
> 
> Many thanks in advance!!
> 
John - you would be making a strong normality assumption.  You might 
reject H0 using Grubbs' test just because of non-normality, or you might 
fail to reject it just because of non-normality.  Is it really this 
straightforward to declare something an outlier?  What does outlier really 
mean?
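
The sensitivity to non-normality can be illustrated with a small simulation. This is only a sketch under stated assumptions, not part of the original message: the sample size, number of simulations, seed, and the two comparison distributions (lognormal and t with 3 df) are arbitrary choices, and grubbs_stat is a hypothetical helper written here. The 5% critical value is estimated from normal samples, then applied to samples drawn from the other two distributions with no contamination added.

```r
set.seed(1)      # arbitrary seed, for reproducibility only
n     <- 20      # observations per sample (assumed)
n_sim <- 5000    # number of simulated samples (assumed)

# Two-sided Grubbs statistic: largest absolute deviation from the mean, in SD units
grubbs_stat <- function(x) max(abs(x - mean(x))) / sd(x)

# Empirical 5% critical value when the data really are normal
g_normal <- replicate(n_sim, grubbs_stat(rnorm(n)))
crit     <- quantile(g_normal, 0.95)

# Same statistic on uncontaminated samples from a skewed and a heavy-tailed distribution
g_lognormal <- replicate(n_sim, grubbs_stat(rlnorm(n)))
g_t3        <- replicate(n_sim, grubbs_stat(rt(n, df = 3)))

# Share of samples in which the normal-theory cutoff would declare an "outlier"
rate_lognormal <- mean(g_lognormal > crit)
rate_t3        <- mean(g_t3 > crit)
print(c(nominal = 0.05, lognormal = rate_lognormal, t3 = rate_t3))
```

Rejection rates well above the nominal 5% in the last two columns would reflect the shape of the distribution, not the presence of aberrant observations.
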
The following is must reading.
@Article{fin06cal,
   author =  {Finney, David J.},
   title =   {Calibration guidelines challenge outlier practices},
   journal = {The American Statistician},
   year =    2006,
   volume =  60,
   pages =   {309-313},
   annote =  {anticoagulant therapy; bias; causation; ethics; objectivity; outliers; guidelines for treatment of outliers; overview of types of outliers; letter to the editor and reply 61:187 May 2007}
}
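
On the mechanics of the original question only, and subject to every caveat above: one way to run a test once per row is apply() over the numeric columns, which hands each row to a function as a plain vector. The sketch below is an assumption-laden illustration. grubbs_row is a hand-written two-sided Grubbs calculation (not the grubbs.test function from the outliers package), the data frame and column names are made up, and "the number flagged" is interpreted as the value farthest from the row mean.

```r
# Hypothetical helper: two-sided Grubbs p-value and most extreme value for one row
grubbs_row <- function(x) {
  n <- sum(!is.na(x))
  if (n < 3 || sd(x, na.rm = TRUE) == 0) {
    return(c(p_value = NA_real_, suspect_value = NA_real_))
  }
  dev <- abs(x - mean(x, na.rm = TRUE))
  g   <- max(dev, na.rm = TRUE) / sd(x, na.rm = TRUE)
  # Convert G to a squared t statistic on n - 2 df; guard against rounding below zero
  denom <- max((n - 1)^2 - n * g^2, 0)
  t2    <- n * (n - 2) * g^2 / denom
  # Two-sided p-value with the usual n-fold bound, capped at 1
  p <- min(1, 2 * n * pt(sqrt(t2), df = n - 2, lower.tail = FALSE))
  c(p_value = p, suspect_value = x[which.max(dev)])
}

set.seed(2)                                            # arbitrary seed
df <- as.data.frame(matrix(rnorm(5 * 8), nrow = 5))    # made-up data: 5 rows, 8 columns
df[3, 4] <- 12                                         # plant one extreme value in row 3
num_cols <- names(df)                                  # columns to test (here: all of them)

# apply(..., 1, f) passes each row as a vector; t() puts one result row per data row
res <- t(apply(as.matrix(df[num_cols]), 1, grubbs_row))
df$grubbs_p <- res[, "p_value"]
df$suspect  <- res[, "suspect_value"]
print(df[, c("grubbs_p", "suspect")])
```

With many rows this amounts to many tests, so the resulting p-values would need a multiplicity adjustment in addition to the normality assumption already noted.
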
-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University
    
    