[BioC] reasonable Illumina hyperG test
Sebastien Gerega
seb at gerega.net
Fri Sep 5 07:18:00 CEST 2008
Hi,
I have been looking around at examples of the hyperGTest (in the
GOstats, lumi, and other documentation) and feel like I have seen many
slight variations on the methodology.
These variations are usually found in the way the non-specific filtering
is performed. I haven't come across many examples of a hyperGTest for
KEGG pathways and would like to ask whether my approach seems reasonable
or whether I should make any changes.
Here is my code ("sig" is a vector of Entrez Gene IDs):
# Packages assumed from the objects and functions used below
library(lumi)
library(GOstats)
library(genefilter)
library(lumiHumanAll.db)

uni = exprs(lumi.N.P)
allFeatures = rownames(uni)  # assumption: allFeatures is every probe on the array

# Remove probes without PATH annotation
havePATH = sapply(mget(allFeatures, lumiHumanAllPATH),
                  function(x) !(length(x) == 1 && is.na(x)))
uni = uni[names(which(havePATH)), ]

# Remove probes with little variation across samples
iqrCutoff = 0.5
uni.IQR = apply(uni, 1, IQR)
uni = uni[which(uni.IQR > iqrCutoff), ]

# For each gene, keep only the probe with the largest IQR
uni = uni[findLargest(rownames(uni), uni.IQR[rownames(uni)],
                      "lumiHumanAll"), ]

# universeGeneIds expects a vector of Entrez IDs, not the list returned by mget
uni = unique(unlist(mget(rownames(uni), lumiHumanAllENTREZID)))

params = new("KEGGHyperGParams", geneIds = sig, universeGeneIds = uni,
             annotation = "lumiHumanAll", pvalueCutoff = 0.05,
             testDirection = "over")
hgOver = hyperGTest(params)
Does this code/approach seem reasonable? Should I correct for multiple
testing after the hyperGTest?
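
To make the multiple-testing question concrete, here is a minimal sketch of the kind of correction I have in mind, assuming a Benjamini-Hochberg adjustment with base R's p.adjust. The pathway IDs and p-values are made up for illustration; with the real result I assume they would come from pvalues(hgOver).

```r
# Hypothetical raw p-values, one per KEGG pathway (made-up numbers).
# With a real result these would be taken from pvalues(hgOver).
rawP <- c("04110" = 0.0004, "04115" = 0.012, "03030" = 0.04,
          "04512" = 0.20, "00190" = 0.65)

# Benjamini-Hochberg adjustment (controls the false discovery rate)
adjP <- p.adjust(rawP, method = "BH")

# Pathways that remain significant at an FDR of 5%
sigPaths <- names(adjP)[adjP < 0.05]

print(adjP)
print(sigPaths)
```

Note that the adjustment should be applied to the p-values of all tested pathways, not only to those that passed pvalueCutoff.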
Would it be fair to perform a test on gene ontologies in the same way
(obviously after changing the params class and specifying an
ontology branch)?
thanks,
Sebastien