[R] Multiple count if style "queries"

Chris Mcowen chrismcowen at gmail.com
Sat May 21 00:21:39 CEST 2011


Thanks Phil,

I will investigate how to put it into a exportable object.

Chris
On 20 May 2011, at 23:08, Phil Spector wrote:

Chris -
  It's pretty easy to calculate these counts, the tricky part
is what kind of object to put them in.  Here's a way that returns
a list of lists:

> parts = split(data2,data2$ECO_NAME)
> result  = lapply(parts,function(x)
          list(Order=table(as.character(x$Order)),
               Family=table(as.character(x$Family)),
               Genus=table(as.character(x$Genus))))

So to reproduce your (poorly chosen) example:

> result[['Biak-Numfoor rain forests']]
$Order

Alismatales Asparagales
         1           1

$Family

   Araceae Orchidaceae
         1           1

$Genus

Bromheadia Homalomena
        1          1

or to see each piece individually:

> result[['Biak-Numfoor rain forests']]$Order

Alismatales Asparagales
         1           1 
> result[['Biak-Numfoor rain forests']]$Family

   Araceae Orchidaceae
         1           1 
> result[['Biak-Numfoor rain forests']]$Genus

Bromheadia Homalomena
        1          1

Hope this helps.

					- Phil Spector
					 Statistical Computing Facility
					 Department of Statistics
					 UC Berkeley
					 spector at stat.berkeley.edu




On Fri, 20 May 2011, Chris Mcowen wrote:

> Sorry for not including the data, i did intend to.
> 
> I tried your code but got this error:
> 
>>> answer <- sqldf("
>> +     select ECO_NAME
>> +               , count(distinct Order) as Order
>> +               , count(distinct Family) as Family
>> +               , count(distinct Genus) as Genus
>> +        from data2
>> +        group by ECO_NAME
>> + ")
>> Loading required package: tcltk
>> Loading Tcl/Tk interface ... done
>> Error in sqliteExecStatement(con, statement, bind.data) :
>>  RS-DBI driver: (error in statement: near "Order": syntax error)
> 
> For the second question i would want:
> 
> The number of times each Order/Family and Genus was represented in each eco_region.
> 
> I.E
> 
> Biak-Numfoor rain forests
> Order: Alismatales-1, Araceae-1
> Family: Homalomena -1, Asparagales-1
> Genus: Orchidaceae-1, Bromheadia-1
> 
> So the number after each name is the count.
> 
> Thanks for the help.
> 
> Chris
> 
> 
> data2 <- structure(list(ECO_NAME = structure(c(1L, 1L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
> 3L, 3L, 3L, 3L, 3L), .Label = c("Biak-Numfoor rain forests",
> "Central Range montane rain forests", "Huon Peninsula montane rain forests"
> ), class = "factor"), Order = structure(c(1L, 2L, 1L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L,
> 2L, 2L, 2L, 2L, 4L, 4L), .Label = c("Alismatales", "Asparagales",
> "Commelinales", "Poales", "Zingiberales"), class = "factor"),
>   Family = structure(c(1L, 7L, 1L, 6L, 7L, 7L, 7L, 7L, 7L,
>   7L, 7L, 3L, 2L, 4L, 4L, 5L, 8L, 8L, 9L, 9L, 9L, 7L, 7L, 7L,
>   7L, 2L, 8L), .Label = c("Araceae", "Centrolepidaceae", "Commelinaceae",
>   "Cyperaceae", "Eriocaulaceae", "Lomandraceae", "Orchidaceae",
>   "Poaceae", "Zingiberaceae"), class = "factor"), Genus = structure(c(13L,
>   3L, 13L, 6L, 19L, 9L, 20L, 4L, 4L, 9L, 9L, 15L, 5L, 14L,
>   10L, 11L, 17L, 16L, 1L, 8L, 2L, 18L, 7L, 19L, 12L, 5L, 16L
>   ), .Label = c("Alpinia", "Amomum", "Bromheadia", "Bulbophyllum",
>   "Centrolepis", "Cordyline", "Corybas", "Curcuma", "Dendrobium",
>   "Eleocharis", "Eriocaulon", "Glomera", "Homalomena", "Machaerina",
>   "Murdannia", "Poa", "Schizostachyum", "Taeniophyllum", "Thelymitra",
>   "Vanda"), class = "factor")), .Names = c("ECO_NAME", "Order",
> "Family", "Genus"), class = "data.frame", row.names = c(NA, -27L
> ))
> 
> 
> On 20 May 2011, at 22:33, jim holtman wrote:
> 
> use the 'sqldf' package.  Also use 'dput' to include sample data since
> it was impossible to use the data in the format you provided, so my
> guess at a solution would be:
> 
> answer <- sqldf("
>   select ECO_NAME
>             , count(distinct Order) as Order
>             , count(distinct Family) as Family
>             , count(distinct Genus) as Genus
>      from yourData
>      group by ECO_NAME
> ")
> 
> Not sure exactly what you wanted in the second question; it would be
> nice to include what some sample output would be so we have an idea of
> what you are after.
> 
> On Fri, May 20, 2011 at 5:18 PM, Chris Mcowen <chrismcowen at gmail.com> wrote:
>> Dear List,
>> 
>> I am looking to calculate two things from my data frame and was after some advice. For the example below i want to know.
>> 
>> 1. How many unique Orders/Families and Genera there are per eco-name
>> 
>> 2. How many incidences are there for each Order/Family and Genus there are per eco-region
>> 
>> I have 650 econame.
>> 
>> I.e for Biak-Numfoor rain forests there are 2 orders, 2 families and two genera. Also, Alismatales are represented once, Asparagales once etc etc.
>> 
>> Thanks for any advice.
>> 
>> Chris
>> 
>> 
>> ECO_NAME                                                        Order           Family                                  Genus
>> Biak-Numfoor rain forests                               Alismatales             Araceae                         Homalomena
>> Biak-Numfoor rain forests                               Asparagales             Orchidaceae                     Bromheadia
>> Central Range montane rain forests      Alismatales             Araceae                         Homalomena
>> Central Range montane rain forests      Asparagales             Lomandraceae            Cordyline
>> Central Range montane rain forests      Asparagales             Orchidaceae                     Thelymitra
>> Central Range montane rain forests      Asparagales             Orchidaceae                     Dendrobium
>> Central Range montane rain forests      Asparagales             Orchidaceae                     Vanda
>> Central Range montane rain forests      Asparagales             Orchidaceae                     Bulbophyllum
>> Central Range montane rain forests      Asparagales             Orchidaceae                     Bulbophyllum
>> Central Range montane rain forests      Asparagales             Orchidaceae                     Dendrobium
>> Central Range montane rain forests      Asparagales             Orchidaceae                     Dendrobium
>> Central Range montane rain forests      Commelinales    Commelinaceae           Murdannia
>> Central Range montane rain forests      Poales                  Centrolepidaceae        Centrolepis
>> Central Range montane rain forests      Poales                  Cyperaceae                      Machaerina
>> Central Range montane rain forests      Poales                  Cyperaceae                      Eleocharis
>> Central Range montane rain forests      Poales                  Eriocaulaceae           Eriocaulon
>> Central Range montane rain forests      Poales                  Poaceae                 Schizostachyum
>> Central Range montane rain forests      Poales                  Poaceae                 Poa
>> Central Range montane rain forests      Zingiberales            Zingiberaceae           Alpinia
>> Central Range montane rain forests      Zingiberales            Zingiberaceae           Curcuma
>> Central Range montane rain forests      Zingiberales            Zingiberaceae           Amomum
>> Huon Peninsula montane rain forests     Asparagales             Orchidaceae                     Taeniophyllum
>> Huon Peninsula montane rain forests     Asparagales             Orchidaceae                     Corybas
>> Huon Peninsula montane rain forests     Asparagales             Orchidaceae                     Thelymitra
>> Huon Peninsula montane rain forests     Asparagales             Orchidaceae                     Glomera
>> Huon Peninsula montane rain forests     Poales                  Centrolepidaceae        Centrolepis
>> Huon Peninsula montane rain forests     Poales                  Poaceae                 Poa
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 
> 
> 
> -- 
> Jim Holtman
> Data Munger Guru
> 
> What is the problem that you are trying to solve?
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 



More information about the R-help mailing list