[R] strange answer when using 'aggregate()' with a formula

Fox, John jfox at mcmaster.ca
Thu Jan 21 07:52:36 CET 2016


Dear Chel Hee Lee,

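With the formula method, the default na.action is na.omit, so the row with
y = NA is dropped before your function is applied and the count for group 2
comes out as 0.  This is the documented behaviour rather than a bug.  Passing
na.action=na.pass keeps that row; thus,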

> aggregate(y~grp, data=tmp, function(x) sum(is.na(x)), na.action=na.pass)
  grp y
1   2 1
2   3 0
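
As a minimal sketch of the difference, the two calls can be compared side by side on the data from the original post. The helper name count_na and the result names res_omit and res_pass are only illustrative and do not appear in the original message.

```r
# Data from the original post
tmp <- data.frame(grp = c(2, 3, 2, 3), y = c(NA, 0.5, 3, 0.5))

# Illustrative helper (hypothetical name): number of missing values in x
count_na <- function(x) sum(is.na(x))

# Formula method with its default na.action = na.omit:
# the row with y = NA is removed before count_na is applied.
res_omit <- aggregate(y ~ grp, data = tmp, FUN = count_na)

# na.pass keeps incomplete rows, so count_na sees the NA in group 2.
res_pass <- aggregate(y ~ grp, data = tmp, FUN = count_na,
                      na.action = na.pass)

# Roughly what the function gets to work with under the default:
# only the three complete rows remain.
complete_rows <- na.omit(tmp)

print(res_omit)
print(res_pass)
print(complete_rows)
```

The non-formula call, aggregate(x=tmp$y, by=list(grp=tmp$grp), ...), does no such filtering, which is why it reports the missing value without any extra argument.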

I hope this helps,
 John

-----------------------------
John Fox, Professor
McMaster University
Hamilton, Ontario
Canada L8S 4M4
Web: socserv.mcmaster.ca/jfox


> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Chel Hee Lee
> Sent: January 21, 2016 5:08 AM
> To: R-help at r-project.org
> Subject: [R] strange answer when using 'aggregate()' with a formula
> 
> Could you kindly test the following code?  I found a strange answer
> when 'aggregate()' is used with a formula.
> 
> I am trying to count how many missing data entries are in each group.
> For this exercise, I created data as below:
> 
>  > tmp <- data.frame(grp=c(2,3,2,3), y=c(NA, 0.5, 3, 0.5))
>  > tmp
>    grp   y
> 1   2  NA
> 2   3 0.5
> 3   2 3.0
> 4   3 0.5
> 
> I see that observations (variable y) can be grouped into two groups (variable
> grp).  For group 2, y has NA and 3.0.  For group 3, y has 0.5 and 0.5.  Hence, the
> number of missing values is 1 and 0 for groups 2 and 3, respectively.  This
> can be done using 'aggregate()' in the 'stats' package as below:
> 
>  > aggregate(x=tmp$y, by=list(grp=tmp$grp), function(x) sum(is.na(x)))
>    grp x
> 1   2 1
> 2   3 0
> 
> A formula can be used as below:
> 
>  > aggregate(y~grp, data=tmp, function(x) sum(is.na(x)))
>    grp y
> 1   2 0
> 2   3 0
> 
> What a surprise!  Is this a bug?  I would appreciate it if you could share
> the results after testing the code.  Thank you so much for your help in
> advance!
> 
> Chel Hee Lee
> 
