[R] Warnings by functions mean(), median()
Prof Brian Ripley
ripley at stats.ox.ac.uk
Sat Feb 19 13:58:27 CET 2005
On Sat, 19 Feb 2005 mailpuls at gmx.net wrote:
> The following functions do not work correctly with my data: median(), geo.mean().
>
> My data files contain more than 10,000 lines and six columns from a
> flow-cytometer measurement. I need the arithmetic and geometric mean and
> the median. For the calculation of the geometric mean I wrote the following function:
>
> fix(geo.mean)
>
> function(x)
> {
> n<-length(x)
> gm<-prod(x)^(1/n)
> return(gm)
> }
>
> The function median() gives the error "need numeric data". The data are
> numeric. The function geo.mean() returns "[1] NaN". What are the reasons and
> what are the solutions?
>
> I am a newbie and urgently need information.
0) `data' is a bad choice of name as it masks an R system function.
1) `data' appears to be a data frame, not numeric data, as median says.
Do you want a summary for each column or for the whole table?
You need sapply(data, median) for the former or median(as.matrix(data))
for the latter.
2) Your function is trying to take a fractional power of 0, and what do you
think that is? (0) However, it is liable to under/overflow (10000
numbers of size 100 have product 10^20000, way more than IEC 60559
arithmetic can represent, so you have (Inf*0)^(1/10001) = NaN). You want
something like
geo.mean <- function(x)
{
    if(any(x < 0)) stop("need non-negative data")
    exp(mean(log(x)))
}
which will even work for a data frame. But I can tell you the answer is
0 for the data you show.
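The overflow argument can be checked on made-up numbers. What follows is only a sketch: the vector x (10000 values of 100) and the small data frame df are hypothetical, not the poster's file, and geo_mean is an illustrative name. sapply() is used for the data frame case on the assumption that a current R version is in use, where mean() applied to a whole data frame no longer returns column means.

```r
# Hypothetical data: 10000 values of size 100, as in the overflow argument.
x <- rep(100, 10000)

# The product 10^20000 overflows to Inf, so the root is Inf rather than 100.
naive <- prod(x)^(1 / length(x))                     # Inf

# One extra zero turns the product into Inf * 0 = NaN.
naive_zero <- prod(c(x, 0))^(1 / (length(x) + 1))    # NaN

# Log-based version: log(0) is -Inf, so any zero gives exp(-Inf) = 0.
geo_mean <- function(x) {
  if (any(x < 0)) stop("need non-negative data")
  exp(mean(log(x)))
}

gm      <- geo_mean(x)         # 100, up to rounding
gm_zero <- geo_mean(c(x, 0))   # 0

# Column by column on a small hypothetical data frame.
df <- data.frame(a = c(1, 10, 100), b = c(2, 8, 0))
col_gm <- sapply(df, geo_mean) # a: 10 (up to rounding), b: 0
```

Working on the log scale keeps every intermediate value within range, which is why the second version gives a finite answer where the product does not.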
For more information, see `An Introduction to R' or a good book on data
manipulation with S/R, plus Numerical Analysis 101.
> Here is a short output with the results:
>
>  9997 385.42  68.54   9.82  124.09  23.93  138.24
>  9998 342.89  73.65 133.35 1134.19 345.99 1876.88
>  9999 316.23  76.35  48.26  421.70 129.80  873.79
> 10000 291.64 103.66   6.85  107.46  26.42  189.38
> 10001   0.00   0.00   0.00    0.00   0.00    0.00
>> mean(data)
>       FSC       SSC       FL1       FL2      FL32       FL4
> 375.94880  73.76219  50.73413 434.42837 110.06393 637.34980
>> geo.mean(data)
> [1] NaN
>> median(data)
> Error in median(data) : need numeric data
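The median() error quoted above can be reproduced, and avoided, on a small stand-in. This is a sketch only: df is a hypothetical data frame built from the first two columns of the five rows shown, not the full file, and it assumes every column is numeric.

```r
# Hypothetical stand-in: FSC and SSC values from the five rows shown above.
df <- data.frame(FSC = c(385.42, 342.89, 316.23, 291.64, 0),
                 SSC = c(68.54, 73.65, 76.35, 103.66, 0))

# A data frame is a list of columns, not a numeric vector.
is_num <- is.numeric(df)                      # FALSE

# median() on the data frame itself stops with an error; capture the message.
msg <- tryCatch(median(df), error = function(e) conditionMessage(e))

# One median per column.
col_medians <- sapply(df, median)             # FSC 316.23, SSC 73.65

# A single median over every cell of the table.
all_median <- median(as.matrix(df))           # 90.005
```

Which of the two forms is wanted depends on whether the columns are to be summarised separately or pooled.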
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595