[BioC] hyperGTest, KEGG
ivan.borozan at utoronto.ca
Mon Apr 16 16:27:01 CEST 2007
Hi Seth,
Thanks, I found a workaround for my problem (for the non-devel version of
the Category package):
geneIds(hgOver)[geneIds(hgOver) %in% hgOver@catToGeneId[[i]]]
where i runs over the significant KEGG terms obtained from hyperGTest(params).
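
For anyone who wants to try the idea without a result object at hand, here is a
rough base-R sketch of the same subsetting applied to every significant
category at once. The names selected, catToGeneId and sigCats are hypothetical
stand-ins (for geneIds(hgOver), the catToGeneId slot and the significant KEGG
IDs), and the gene lists are small hand-picked subsets, not real hyperGTest
output:

```r
# Hypothetical toy data standing in for a real result object.
selected <- c("YOR120W", "YFL026W", "YLR342W")  # role of geneIds(hgOver)
catToGeneId <- list(                            # role of the catToGeneId slot
  "00625" = c("YCR105W", "YOR120W", "YKR009C"),
  "04010" = c("YAL041W", "YFL026W", "YLR342W", "YPR165W")
)
sigCats <- c("00625", "04010")                  # significant KEGG IDs (assumed)

# For each significant category, keep the selected genes that belong to it.
genesByCat <- lapply(catToGeneId[sigCats],
                     function(ids) selected[selected %in% ids])

# Number of selected genes per category.
counts <- sapply(genesByCat, length)
```

With a real result, the toy objects would be replaced by the accessor and slot
from the line above; I have only checked the sketch against the toy data.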
Cheers,
Ivan
Quoting Seth Falcon <sfalcon at fhcrc.org>:
> Hi Ivan,
>
> ivan.borozan at utoronto.ca writes:
>> I've used the script below to calculate over-represented KEGG
>> categories; however, I cannot get to the gene IDs associated with each
>> of the over-represented KEGG terms/pathways.
>
> I've been working on making results easier to work with and also
> improving the documentation. This is all happening in the devel arm
> (which will soon become the next release).
>
> With a (very) recent version of Category you can get help on all
> accessors for the result objects returned by hyperGTest:
>
> help("HyperGResult-accessors")
>
>> My question: does catToGeneId() exist, and how do I get to the genes
>> that are associated with each of the above pathways?
>
> To get the category to universe of gene IDs mapping:
>
> > geneIdUniverse(ans)[1:2]
> $`00625`
>  [1] "YCR105W" "YCR107W" "YDL243C" "YDR368W" "YFL056C" "YHR104W" "YJR155W"
>  [8] "YKR009C" "YNL331C" "YOR120W"
>
> $`04010`
>  [1] "YAL041W" "YBL016W" "YBL105C" "YBR083W" "YBR200W" "YCL027W" "YDL159W"
>  [8] "YDL235C" "YDR103W" "YDR461W" "YDR480W" "YER111C" "YER118C" "YFL026W"
> [15] "YGL089C" "YGR032W" "YGR040W" "YGR088W" "YHL007C" "YHR005C" "YHR030C"
> [22] "YHR084W" "YIL147C" "YJL095W" "YJL128C" "YJL157C" "YJR086W" "YKL062W"
> [29] "YKL178C" "YKR095W" "YLR006C" "YLR113W" "YLR182W" "YLR229C" "YLR332W"
> [36] "YLR342W" "YLR362W" "YML004C" "YMR037C" "YMR043W" "YNL053W" "YNL098C"
> [43] "YNL145W" "YNL271C" "YNL283C" "YNR031C" "YOL105C" "YOR008C" "YOR212W"
> [50] "YOR231W" "YPL049C" "YPL089C" "YPL140C" "YPL187W" "YPR165W"
>
> To get the category to _selected_ gene IDs mapping:
>
> > geneIdsByCategory(ans)[1:2]
> $`00625`
> [1] "YOR120W"
>
> $`04010`
> [1] "YFL026W" "YLR342W"
>
> The number of selected genes in each category (just the length of each
> element of the above):
>
> > geneCounts(ans)[1:2]
> 00625 04010
>     1     2
>
> NOTE: I used the YEAST annotation data package as an example. It is
> non-typical in that it does not use Entrez Gene IDs as the base
> identifier. For your example, you will get Entrez IDs and you can map
> those to SYMBOL if you want using the appropriate annotation data
> package.
>
> The above examples were done using:
>
> R 2.5.0 beta, Category 2.1.36
>
> sessionInfo()
> R version 2.5.0 beta (--)
> powerpc-apple-darwin8.9.0
>
> locale:
> C
>
> attached base packages:
> [1] "splines" "tools" "stats" "graphics" "grDevices" "datasets"
> [7] "utils" "methods" "base"
>
> other attached packages:
>         YEAST      Category AnnotationDbi       RSQLite           DBI
>     "1.15.13"      "2.1.36"      "0.0.58"       "0.5-4"       "0.2-1"
>        Matrix       lattice    genefilter      survival      annotate
>   "0.9975-11"      "0.15-3"     "1.13.12"        "2.31"      "1.13.7"
>            GO          KEGG         graph       Biobase
>     "1.15.13"     "1.15.13"     "1.13.10"     "1.13.48"
>
> Hope that helps.
>
> + seth
>
> --
> Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
> http://bioconductor.org
>