[BioC] Help with HyperGTest

James W. MacDonald jmacdon at med.umich.edu
Wed Nov 11 19:48:33 CET 2009


Hi Yue,

I see the problem now. The Entrez IDs used by this package are for the 
MG1655 substrain rather than the DH10B substrain.
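
One way to check this yourself is to compare your IDs against the keys the 
package actually contains. The following is only a rough sketch: 
id_overlap() is a helper name made up for illustration, and the second half 
assumes org.EcK12.eg.db is installed and that its org.EcK12egGO map is the 
one of interest.

```r
# Report how many of your IDs an annotation package knows about.
# Both inputs are coerced to character, because the package keys are strings.
id_overlap <- function(my_ids, known_ids) {
  my_ids <- unique(as.character(my_ids))
  known_ids <- unique(as.character(known_ids))
  found <- my_ids %in% known_ids
  list(n_input = length(my_ids),   # distinct IDs supplied
       n_found = sum(found),       # how many the package recognises
       missing = my_ids[!found])   # IDs with no match
}

# This part only runs if the annotation package is installed.
if (requireNamespace("org.EcK12.eg.db", quietly = TRUE)) {
  library(org.EcK12.eg.db)
  # Entrez Gene IDs that have at least one GO annotation in the package
  known <- mappedkeys(org.EcK12egGO)
  print(head(known))
  # A few of the DH10B IDs posted earlier in this thread
  res <- id_overlap(c("6058204", "6058276", "6058499"), known)
  print(res$n_found)
}
```

If n_found comes back as zero for the whole universe, hyperGTest has nothing 
left after filtering, which would be consistent with the error reported.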

I don't think it would be a trivial exercise to make your own annotation 
package - this package is built off the ecoliK12.db0 package that Marc 
Carlson creates, so it isn't as simple as using SQLForge to create a new 
package. Creating the db0 packages is non-trivial, so there are no 
facilities (that I know of) for end users to create one.

I don't know of a way around this problem, as I am not familiar with the 
different E. coli strains. If there is a one-to-one mapping of genes 
between strains, then you might be able to map your DH10B Entrez Gene IDs 
to their MG1655 equivalents and go from there.
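
If such a mapping can be obtained, the translation step itself is simple. 
Here is a minimal sketch, assuming a two-column lookup table; strain_map, 
its column names, map_ids() and the MG1655 values are all made-up 
placeholders, not real identifiers.

```r
# Hypothetical lookup table: one row per gene, pairing a DH10B Entrez Gene ID
# with its MG1655 counterpart. The mg1655 values are placeholders only.
strain_map <- data.frame(
  dh10b  = c("6058204", "6058276", "6058499"),
  mg1655 = c("A1", "A2", "A3"),
  stringsAsFactors = FALSE
)

# Translate DH10B IDs to MG1655 IDs, dropping any ID with no counterpart.
map_ids <- function(ids, map) {
  ids <- as.character(ids)
  out <- map$mg1655[match(ids, map$dh10b)]
  unique(out[!is.na(out)])
}

# The last ID is deliberately absent from the table and is dropped.
target_mg <- map_ids(c("6058204", "6058499", "6059999"), strain_map)
print(target_mg)
```

The same function would be applied to both the target IDs and the universe 
IDs before building the GOHyperGParams object.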

Best,

Jim



Yue, Chen - BMD wrote:
> Hi Jim,
>  
> Thanks for your answer. I tried your suggestion but it still gave 
> me the same error. How can I find out which Entrez IDs are used by 
> "org.EcK12.eg.db"? Is there any way to make my own annotation package for 
> EcK12 substr DH10B? Thanks a lot!
>  
> Regards,
> Chen, Yue
>  
> 
> ------------------------------------------------------------------------
> *From:* James W. MacDonald [mailto:jmacdon at med.umich.edu]
> *Sent:* Tue 11/10/2009 1:19 PM
> *To:* Yue, Chen - BMD
> *Cc:* Marc Carlson; bioconductor at stat.math.ethz.ch
> *Subject:* Re: [BioC] Help with HyperGTest
> 
> Most likely those values are read in as numeric, which won't work: the
> annotation package stores Entrez Gene IDs as character strings. You need
> to convert to character, or use
> 
> targetid <- scan("targetids.txt", what = "character")
> 
> to read in.
> 
> Best,
> 
> Jim
> 
> 
> 
> Yue, Chen - BMD wrote:
>  > Hi Marc and Jim,
>  > 
>  > I'm sorry about the stripped attachment. I listed some targetid and
>  > ecoliid I used. Can you take a look? Thanks!
>  > 
>  > Regards,
>  > 
>  > Chen, Yue
>  > 
>  > <<targetids.txt>>
>  > 6058204
>  > 6058276
>  > 6058499
>  > 6058576
>  > 6058687
>  > 6058820
>  > 6058853
>  > 6058937
>  > 6058989
>  > 6059024
>  > 6059029
>  > 6059123
>  > 
>  > <<ecoliids.txt>>
>  > 6061999
>  > 6061998
>  > 6061997
>  > 6061996
>  > 6061995
>  > 6061994
>  > 6061993
>  > 6061992
>  > 6061991
>  > 6061990
>  > 6061989
>  > 6061988
>  > 6061987
>  > 6061986
>  > 6061985
>  > 6061984
>  > 6061983
>  > 6061982
>  > 6061981
>  > 6061980
>  > 6061979
>  > 6061978
>  > 6061977
>  > 6061976
>  > 6061975
>  > 6061974
>  > 6061973
>  > 6061972
>  > 6061971
>  > 6061970
>  > 6061969
>  > 6061968
>  > 6061967
>  > 
>  >
>  > ------------------------------------------------------------------------
>  > *From:* Marc Carlson [mailto:mcarlson at fhcrc.org]
>  > *Sent:* Tue 11/10/2009 11:39 AM
>  > *To:* Yue, Chen - BMD
>  > *Cc:* bioconductor at stat.math.ethz.ch
>  > *Subject:* Re: [BioC] Help with HyperGTest
>  >
>  > Hi Yue,
>  >
>  > It's a good idea to always give us the output of sessionInfo() when you
>  > post, but I can tell you that this error is usually caused by the
>  > input IDs.  If you are using org.EcK12.eg.db, then you must supply
>  > Entrez Gene IDs.  What is the output from head(targetids) and
>  > head(ecoliids)?
>  >
>  >   Marc
>  >
>  >
>  >
>  >
>  > Yue, Chen - BMD wrote:
>  >  > Dear All,
>  >  >
>  >  > I hope to get some help on the hyperGTest in GOstats. I want to do a
>  > GO enrichment analysis on a set of E. coli K12 genes (substr DH10B). I
>  > attached the target id file, partial ecoli id file (as universeGeneIds)
>  > and sessionInfo to the email. The following are my commands and error. It
>  > seems that my gene ids are not found in the annotation package but I don't
>  > know how to find out what gene ids are included in the package. I used the
>  > "org.EcK12.eg.db" package which uses Entrez ids and my R version is
>  > 2.9.2 on WinXP. Should I use a different annotation package? Thank you
>  > very much!
>  >  >
>  >  > 
>  >  >> targettable <- read.table("D:/RProjects/targetids.txt")
>  >  >> ecolitable <- read.table("D:/RProjects/ecoliids.txt")
>  >  >> targetids <- unique(targettable[,1])
>  >  >> ecoliids <- unique(ecolitable[,1])
>  >  >> params = new("GOHyperGParams", geneIds=targetids,
>  > universeGeneIds=ecoliids, annotation="org.EcK12.eg.db", ontology="BP",
>  > pvalueCutoff=0.01, conditional=FALSE, testDirection="over")
>  >  >> BPoverTest = hyperGTest(params)
>  >  >>   
>  >  > Error in getUniverseHelper(probes, datPkg, entrezIds) :
>  >  >   After filtering, there are no valid IDs that can be used as the
>  > Gene universe.
>  >  >   Check input values to confirm they are the same type as the central
>  > ID used by your annotation package.
>  >  >   For chip packages, this will still mean the central GENE identifier
>  > used by the package (NOT the probe IDs).
>  >  >
>  >  > Regards,
>  >  >
>  >  > Yue
>  >  >
>  >  >
>  >  >
>  >  > This email is intended only for the use of the individual or entity
>  > to which it is addressed and may contain information that is privileged
>  > and confidential. If the reader of this email message is not the
>  > intended recipient, you are hereby notified that any dissemination,
>  > distribution, or copying of this communication is prohibited. If you
>  > have received this email in error, please notify the sender and
>  > destroy/delete all copies of the transmittal. Thank you.
>  >  >
>  >  > 
>  >  > 
> ------------------------------------------------------------------------
>  >  >
>  >  > _______________________________________________
>  >  > Bioconductor mailing list
>  >  > Bioconductor at stat.math.ethz.ch
>  >  > https://stat.ethz.ch/mailman/listinfo/bioconductor
>  >  > Search the archives:
>  > http://news.gmane.org/gmane.science.biology.informatics.conductor
>  >
>  > 
>  >
>  >
> 
> --
> James W. MacDonald, M.S.
> Biostatistician
> Douglas Lab
> University of Michigan
> Department of Human Genetics
> 5912 Buhl
> 1241 E. Catherine St.
> Ann Arbor MI 48109-5618
> 734-615-7826
> 


