[R] Warnings by functions mean(), median()

Prof Brian Ripley ripley at stats.ox.ac.uk
Sat Feb 19 13:58:27 CET 2005


On Sat, 19 Feb 2005 mailpuls at gmx.net wrote:

> The following functions don't work correctly with my data: median(), geo.mean().
>
> My data files contain more than 10,000 lines and six columns from a 
> flow-cytometer measurement. I need the arithmetic and geometric mean and 
> median. For the calculation of the geometric mean I wrote the following function:
>
>    fix(geo.mean)
>
>    function(x)
>    {
>        n<-length(x)
>        gm<-prod(x)^(1/n)
>        return(gm)
>    }
>
> The function median() gives the error "need numeric data". The data are 
> numeric. The function geo.mean() returned "[1] NaN". What are the reasons and 
> what are the solutions?
>
> I'm a newbie and urgently need information.

0) `data' is a bad choice of name as it masks an R system function.

1) `data' appears to be a data frame, not numeric data, as median says.
Do you want a summary for each column or the whole table?
So you need sapply(data, median) or median(as.matrix(data)).

2) Your function is trying to take a fractional power of 0, and what do you 
think that is?  (0)  However, it is liable to under/overflow (10000 
numbers of size 100 have product 10^20000, way more than IEC60559 
arithmetic can represent, so you have (Inf*0)^(1/10001) = NaN).  You want 
something like

geo.mean <- function(x)
{
     if(any(x < 0)) stop("need non-negative data")
     exp(mean(log(x)))
}

which will even work for a data frame.  But I can tell you the answer is 
0 for the data you show.
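
A minimal sketch of the overflow in point 2, using made-up values rather than the poster's data. The vector `x` and the names `geo.mean.naive` and `geo.mean.log` are illustrative only.

```r
# Hypothetical column: 10000 values of size 100 followed by one zero.
x <- c(rep(100, 10000), 0)

# Product form: the running product overflows to Inf, and Inf * 0 is NaN.
geo.mean.naive <- function(x) prod(x)^(1/length(x))
geo.mean.naive(x)

# Log form: log(0) is -Inf, so the mean is -Inf and exp(-Inf) is 0.
geo.mean.log <- function(x) exp(mean(log(x)))
geo.mean.log(x)

# Without the zero the log form stays in range (approximately 100 here),
# while the product form overflows to Inf.
geo.mean.log(rep(100, 10000))
geo.mean.naive(rep(100, 10000))
```

For one value per column of a data frame, sapply(d, geo.mean.log) applies the same function column by column, where `d` stands for whatever the data frame is called.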

For more information, see `An Introduction to R' or a good book on data 
manipulation with S/R, plus Numerical Analysis 101.
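
As a sketch of the two options in point 1, with a small made-up data frame. The name `d` and its three rows are hypothetical; the column names are borrowed from the quoted output.

```r
# Hypothetical stand-in for the flow-cytometer table.
d <- data.frame(FSC = c(385.42, 342.89, 316.23),
                SSC = c(68.54, 73.65, 76.35))

# One median per column: sapply() hands each column to median() in turn.
col.medians <- sapply(d, median)
col.medians

# One median over every value in the table: as.matrix() turns the
# data frame into a numeric matrix, which median() treats as a vector.
all.median <- median(as.matrix(d))
all.median
```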


> Here is a short output with the results:
>
> 9997   385.42   68.54   9.82  124.09  23.93  138.24
> 9998   342.89   73.65 133.35 1134.19 345.99 1876.88
> 9999   316.23   76.35  48.26  421.70 129.80  873.79
> 10000  291.64  103.66   6.85  107.46  26.42  189.38
> 10001    0.00    0.00   0.00    0.00   0.00    0.00
>> mean(data)
>      FSC       SSC       FL1       FL2      FL32       FL4
> 375.94880  73.76219  50.73413 434.42837 110.06393 637.34980
>> geo.mean(data)
> [1] NaN
>> median(data)
> Error in median(data) : need numeric data

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



