[BioC] GOHyperG for KEGG

Shi, Tao shidaxia at yahoo.com
Tue Sep 6 19:27:05 CEST 2005


Thank you very much, Gunnar.  I'll try that.

At same time, I wrote a function by myself, which I totally stole from GOHyperG.  Just want to
share it with everybody.  Please let me know if there are any bugs!

...Tao

=============================================================================

KEGGHyperG <-
function (geneIDs, lib = "hgu95av2") {
    getDataEnv <- function(name, lib) {
        get(paste(lib, name, sep = ""), mode = "environment")
    }
    require(lib, character.only = TRUE) || stop("need data package", lib)
    if (any(duplicated(geneIDs)))  stop("input IDs must be unique")
    keggV <- as.list(getDataEnv("PATH2PROBE", lib))
    
    whWeHave <- sapply(keggV, function(y) {
        if (is.na(y) || length(y) == 0) 
            return(FALSE)
        ids = unique(unlist(y))
        any(geneIDs %in% ids)
    })

    keggV <- keggV[whWeHave]
    keggV <- sapply(keggV, function(x) {
        if(any(grep("AFFX",x))) { 
            return(x[-grep("AFFX",x)])
        } else {
            return(x)
        }
    } ) ## get rid of control probes

    bad <- sapply(keggV, function(x) (length(x) == 1 && is.na(x)))
    keggV <- keggV[!bad]
    cIDs <- unique(unlist(keggV))
    nIDs <- length(cIDs)
    keggCounts <- sapply(keggV, length)
    ourIDs <- unique(geneIDs[!is.na(geneIDs)])
    ours <- ourIDs[!duplicated(ourIDs)]
    whGood <- ours[ours %in% cIDs]
    
    nInt = length(whGood)
    if (nInt == 0)  { warning("no interesting genes found") }
    useCts <- sapply(keggV, function(x) sum(whGood %in% x))
    
    pvs <- phyper(useCts - 1, nInt, nIDs - nInt, keggCounts, lower.tail = FALSE)
    ord <- order(pvs)
    return(list(pvalues = pvs[ord], keggCounts = keggCounts[ord], 
        chip = lib, kegg2Affy = keggV, intCounts = useCts[ord], numIDs = nIDs, 
        numInt = nInt, intIDs = geneIDs))
}
==================================================================================




--- Gunnar Wrobel <bioc at gunnarwrobel.de> wrote:

> > Is there a similar function like GOHyperG that works on KEGG?  It seems there is no such thing
> > back in Feb. 05 (https://stat.ethz.ch/pipermail/bioconductor/2005-February/007532.html).  Any
> > updates?
> Hi Tao,
> 
> you might try to do this with goCluster. It does the same kind of
> calculation as GOHyperG but can use any kind of annotation.
> 
> Cheers
> 
> Gunnar
>



More information about the Bioconductor mailing list