[R] summary stats including NA's into new dataframe
Uwe Ligges
ligges at statistik.uni-dortmund.de
Thu Dec 19 08:52:04 CET 2002
Alexander.Herr at csiro.au wrote:
> Thanks Uwe,
> Can't seem to get your formula to work...
> I should have made this clearer. I am after a listing of the number of NAs
> and Valid Ns (or total N) for export to csv, e.g.:
> Variable, mean, Missing Values, Valid N
> test, 6.00000,2,18
> bummer,5.44444,1,19
>
> from:
>
> x<-c(1,4,2,6,8,3,5,6,7,8,7,2,4,7,5,1,8,9,8,9)
> labl<-gl(2,2,length=20,labels=c("test","bummer"))
> x[3]<-NA
> x[5]<-NA
> x[6]<-NA
>
>
> aggregate(x,by=list(labl),mean, sum(is.na(x)))
> # Group.1 x
> #1 test NA
> #2 bummer NA
>
> aggregate(x,by=list(labl),mean, na.rm=T)
> # Group.1 x
> #1 test 6.000000
> #2 bummer 5.444444
>
> aggregate(x,by=list(labl),sum(is.na(x)))
> # Error in FUN(X[[1]], ...) : Argument "INDEX" is missing, with no default
You didn't read carefully enough:
aggregate(......., function(x) sum(is.na(x)))
                   ^^^^^^^^^^^
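
To spell out why the wrapper matters, here is a small sketch using the data from your message. As far as I can tell, sum(is.na(x)) on its own is evaluated before aggregate() ever sees it, so aggregate() is handed a plain number where it expects a function. In your first call that number is passed on to mean() as an extra argument (I assume it lands in 'trim'), and since na.rm is still FALSE you get NA. The name 'v' and the tryCatch() wrapper are only there for illustration.

```r
# Data from the original message
x <- c(1,4,2,6,8,3,5,6,7,8,7,2,4,7,5,1,8,9,8,9)
labl <- gl(2, 2, length = 20, labels = c("test", "bummer"))
x[c(3, 5, 6)] <- NA

# This is just a number (the overall NA count), not a function
total_na <- sum(is.na(x))

# Passing that number as FUN fails; catch the error so the script continues
res <- tryCatch(aggregate(x, by = list(labl), sum(is.na(x))),
                error = function(e) "error")

# Wrapped in function(v) ..., aggregate() can call it once per group
nas <- aggregate(x, by = list(labl), function(v) sum(is.na(v)))
nas
```

The last call should give one row per level of labl, with the per-group NA count in column x.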
Or, instead of the anonymous function, you can define a named one:
countna <- function(x) sum(is.na(x))
aggregate(......., countna)
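
Putting it together for the table you describe, a minimal sketch might look like the following. It assumes the x and labl from your message; countvalid, the column names, and the output file name are my own choices for illustration, not anything prescribed by aggregate().

```r
# Data from the original message
x <- c(1,4,2,6,8,3,5,6,7,8,7,2,4,7,5,1,8,9,8,9)
labl <- gl(2, 2, length = 20, labels = c("test", "bummer"))
x[c(3, 5, 6)] <- NA

countna    <- function(x) sum(is.na(x))    # number of missing values
countvalid <- function(x) sum(!is.na(x))   # hypothetical helper: number of non-missing values

# One aggregate() call per statistic; rows come back in the same group order
m  <- aggregate(x, by = list(labl), mean, na.rm = TRUE)
na <- aggregate(x, by = list(labl), countna)
n  <- aggregate(x, by = list(labl), countvalid)

# Combine into a single data frame with one row per group
out <- data.frame(Variable       = m$Group.1,
                  mean           = m$x,
                  Missing.Values = na$x,
                  Valid.N        = n$x)

# Export; the file name and location are placeholders
outfile <- file.path(tempdir(), "summary.csv")
write.csv(out, outfile, row.names = FALSE)
```

Note that with this data each group holds 10 values, so the valid counts per group come out as 8 and 9. Further statistics (sd, max, min) could be added the same way, each with na.rm = TRUE.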
> Cheers Herry
>
> -----Original Message-----
> From: Uwe Ligges [mailto:ligges at statistik.uni-dortmund.de]
> Sent: Wednesday, 18 December 2002 5:30 PM
> To: Alexander.Herr at csiro.au
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] summary stats including NA's into new dataframe
>
>
> Alexander.Herr at csiro.au wrote:
>
>>List,
>>
>>I am trying to extract summary statistics from a data frame with several
>>variables (and NAs) into a dataframe with the columns: Variablename (ie the
>>colnames of original data), mean, stdev, max, min, Valid N, Missing Values.
>>Extracting the statistics is straightforward using stack and aggregate.
>>However, I haven't succeeded in obtaining the number of Missing Values. I
>>can extract these from describe (Hmisc library), but surely there is a
>>simpler way similar to obtaining the mean using aggregate?
>
>
> The similar way is:
>
> aggregate(......., function(x) sum(is.na(x)))
>
> Uwe Ligges
>
>
>>Suggestions are much appreciated