[BioC] GOstat question

James W. MacDonald jmacdon at med.umich.edu
Fri Mar 21 13:42:20 CET 2008


Hi Nicolas,

nicolas servant wrote:
> Sorry. I use the BioC GOStats package (2.2.6).

Yes, but you *still* refuse to follow the posting guide, so we are left 
to guess at what you have done. Here's a hint: always give us the 
results of calling sessionInfo(), and give us a reproducible example. 
The example you give below is not reproducible, as you are the only one 
who has these data. If we can't do what you have done to see what 
happened, how exactly are we to figure out why you get these results?

And by reproducible example I don't mean for you to give us a link to 
your data. You can simply use set.seed(), randomly select some affy IDs, 
and then go on from there as if these were the IDs you got from an analysis.
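
Something along these lines would do. This is only a sketch: to keep it 
runnable without any annotation package, it builds a made-up toy 
environment (probe2entrez, with invented probe and Entrez IDs) that stands 
in for a real mapping such as hgu133plus2ENTREZID. With the real package 
you would sample from the actual probe IDs instead.

```r
# Toy stand-in for an annotation environment (hypothetical data):
# probe ID -> Entrez Gene ID, with some probes unmapped (NA).
probe2entrez <- new.env()
toyProbes <- sprintf("%04d_at", 1:200)
for (i in seq_along(toyProbes)) {
  # every fifth probe has no Entrez ID; neighbouring probes share a gene
  value <- if (i %% 5 == 0) NA_character_ else as.character(1000 + i %/% 2)
  assign(toyProbes[i], value, envir = probe2entrez)
}

set.seed(42)  # fixes the random draw so others get the same IDs
mylist <- sample(ls(probe2entrez), 50)

# Same steps as in the original post: map to Entrez IDs, drop NAs
# and duplicates.
selectedEntrezIds <- unique(unlist(mget(mylist, envir = probe2entrez)))
selectedEntrezIds <- selectedEntrezIds[!is.na(selectedEntrezIds)]

length(mylist)
length(selectedEntrezIds)

# Always include this in a post to the list.
sessionInfo()
```

Because the seed is fixed, anyone who runs this gets the same mylist and 
can follow every later step exactly.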

> In connection with my last mail, I have another question:
> 
> ##universe
> designALL<-names(unlist(as.list(hgu133plus2ACCNUM)))
> entrezUniverse <- 
> na.omit(unique(unlist(mget(designALL,hgu133plus2ENTREZID))))
> ##myList
> selectedEntrezIds <- 
> na.omit(unique(unlist(mget(mylist,hgu133plus2ENTREZID))))
> params.BP.over<-new("GOHyperGParams",geneIds=selectedEntrezIds,universeGeneIds=entrezUniverse,annotation="hgu133plus2",ontology="BP",pvalueCutoff=0.05,conditional=TRUE,testDirection="over")
> params.MF.over<-new("GOHyperGParams",geneIds=selectedEntrezIds,universeGeneIds=entrezUniverse,annotation="hgu133plus2",ontology="MF",pvalueCutoff=0.05,conditional=TRUE,testDirection="over")
> params.CC.over<-new("GOHyperGParams",geneIds=selectedEntrezIds,universeGeneIds=entrezUniverse,annotation="hgu133plus2",ontology="CC",pvalueCutoff=0.05,conditional=TRUE,testDirection="over")
> 
> hgOver.BP<-hyperGTest(params.BP.over)
> hgOver.MF<-hyperGTest(params.MF.over)
> hgOver.CC<-hyperGTest(params.CC.over)
> 
> length(selectedEntrezIds)
> [1] 3761
> length(hgOver.BP@geneIds)
> [1] 3122
> 
> I guess hyperGTest has removed these 639 missing genes, but I don't 
> understand why.

If you were to peruse the vignettes for the GOstats package, you would 
see that there are two filtering steps that you should take before 
running the analysis. The first is to subset to unique Entrez Gene IDs, 
as you have done. The second is to remove those Entrez Gene IDs that 
have no GO annotations associated with them. My best guess is that the 
639 missing genes have no GO annotations, but without a reproducible 
example I can't know.
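
To illustrate the second filtering step, here is a rough sketch in base R. 
The entrez2go list is invented toy data standing in for a real Entrez-to-GO 
mapping, and the names haveGo and annotatedIds are just for illustration. 
The vignette does the equivalent against the real annotation package.

```r
# Hypothetical toy data: Entrez Gene ID -> GO terms (NA = no annotation).
entrez2go <- list(
  "101" = c("GO:0005635", "GO:0006913"),
  "102" = NA,
  "103" = "GO:0005635",
  "104" = NA,
  "105" = "GO:0008150"
)
selectedEntrezIds <- names(entrez2go)

# TRUE when a gene has at least one non-missing GO term.
haveGo <- vapply(entrez2go[selectedEntrezIds],
                 function(x) !all(is.na(x)), logical(1))

# Keep only the annotated genes.
annotatedIds <- selectedEntrezIds[haveGo]

# Number of genes dropped for lack of GO annotation.
length(selectedEntrezIds) - length(annotatedIds)
```

Running the same kind of check on your own IDs would show whether the 
number of unannotated genes matches the 639 that went missing.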

Best,

Jim


> 
> Nicolas
> 
> Robert Gentleman a écrit :
>> Hi Nicolas,
>>
>>    Perhaps you could peruse the posting guide and provide the 
>> information it asks you to.  Only from that could one hope to give you a 
>> reasonable answer (i.e., what commands did you run, what was the output, 
>> and your sessionInfo, at a bare minimum).  And the Bioconductor package 
>> is GOstats, not GOstat - that is something else (not Bioconductor) and 
>> if you want help with it you should ask those folks directly.
>>
>>   Robert
>>
>>
>> nicolas servant wrote:
>>   
>>> Hi,
>>>
>>> I have a question about the GOstat package.
>>> I'm working with a list of probesets, and I'm interested in testing the 
>>> over-representation of GO terms in my list.
>>> So, I converted the probeset IDs (of my list and my universe) into 
>>> Entrez IDs and hyperGTest ran fine.
>>> For instance, I get results such as:
>>>
>>> GO:0005635
>>> Pvalue = 0.04
>>> OddsRatio = 0.04
>>> ExpCount =  11
>>> Count = 17
>>> Size = 45
>>>
>>> But, when i did the opposite :
>>>
>>> test<-mget("GO:0005635",hgu133plus2GO2ALLPROBES)
>>> entrez <- unique(unlist(mget(as.vector(unlist(test)),hgu133plus2ENTREZID)))
>>> length(entrez)
>>> [1] 126
>>>
>>> I don't understand why I find 126 Entrez IDs in the universe, and not 45 
>>> as expected (Size) ...
>>>
>>> mylistentrez<-unique(intersect(entrez,mylist))
>>> length(mylistentrez)
>>> [1] 50
>>>
>>> In the same way, I find 50 Entrez IDs in my list, and not 17 as expected 
>>> (Count).
>>>
>>> Thanks for your explanations,
>>> Bests,
>>>
>>> Nicolas
>>>
>>>     
>>   
> 
> 

-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623


