[R] data manipulation involving aggregate
Gabor Grothendieck
ggrothendieck at gmail.com
Fri May 29 16:46:27 CEST 2009
Try this (DF is the data frame from your example):
> as.data.frame.table(xtabs(area ~ habitat + sq, DF), responseName = "area.sum")[c(2:3, 1)]
   sq area.sum habitat
1   1        0   field
2   1        3  garden
3   1        3    pond
4   1        0   river
5   2        1   field
6   2        2  garden
7   2        0    pond
8   2        0   river
9   3        5   field
10  3        1  garden
11  3        0    pond
12  3        3   river
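
For anyone wanting to reproduce this from scratch, here is a minimal sketch of the same approach split into steps. It assumes the example data frame from the original post is assigned to DF; the intermediate names tab and res are only illustrative and are not part of the one-liner above.

```r
# Rebuild the example data from the original post, assigned to DF
# (the name the one-liner assumes).
DF <- data.frame(
  sq      = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  area    = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  habitat = c("garden", "garden", "pond", "field", "garden",
              "river", "garden", "field", "field")
)

# xtabs() sums area over every habitat x sq combination;
# combinations that never occur in DF are left at 0.
tab <- xtabs(area ~ habitat + sq, data = DF)

# Flatten the table to long format (one row per combination),
# then put the columns in the requested order.
res <- as.data.frame.table(tab, responseName = "area.sum")
res <- res[c("sq", "area.sum", "habitat")]

print(res)
```

Note that sq and habitat come back as factors; if sq is needed as a number again, as.numeric(as.character(res$sq)) should convert it.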
On Fri, May 29, 2009 at 10:27 AM, Simon Pickett <simon.pickett at bto.org> wrote:
> hi all,
>
> I often have a data frame like this example
>
> data.frame(sq=c(1,1,1,2,2,3,3,3,3),area=c(1,2,3,1,2,3,1,2,3),habitat=c("garden","garden","pond","field","garden","river","garden","field","field"))
>
> for each "sq" I have multiple "habitat"s each with an associated "area".
>
> I want to aggregate the data frame so that for each "sq" I have a column of all possible "habitat"s and another column with the summed area for each "habitat". If a certain habitat doesn't exist in that square I want a zero, like this:
>
> data.frame(sq=rep(seq(1:3),each=4),area.sum=c(3,3,0,0,2,0,1,0,1,0,5,3),habitat=rep(c("garden","pond","field","river") ))
>
> Is there an elegant, efficient way of doing this? My solution involves lots of intermediate aggregated data frames, one for each habitat, then a series of merges onto a bigger data frame.
>
> Thanks peeps and have a good weekend,
>
> Simon.
>
> Dr. Simon Pickett
> Research Ecologist
> Land Use Department
> Terrestrial Unit
> British Trust for Ornithology
> The Nunnery
> Thetford
> Norfolk
> IP242PU
> 01842750050