[BioC] List significant genes on a GO table

Quentin Anstee q.anstee at imperial.ac.uk
Thu Feb 23 11:57:08 CET 2006


Hi Jim,

That helps a great deal. Thank you very much.

Best wishes,

Quentin 

> -----Original Message-----
> From: James W. MacDonald [mailto:jmacdon at med.umich.edu] 
> Sent: 21 February 2006 15:31
> To: Quentin Anstee
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] List significant genes on a GO table
> 
> Hi Quentin,
> 
> Quentin Anstee wrote:
> > Dear List,
> >  
> > Can anyone advise me how to add a list of significant genes onto a 
> > gene ontology table so that I can see which of my differentially 
> > expressed genes belong to a given GO group?
> >  
> > I would like to be able to output a table that looks like:
> >  
> > GO_id      Description          p-value   #Genes   Gene_ids/symbols
> > GO:12345   Glucose Metabolism   0.0001    34       IDs of the *significant* probes from the
> >                                                    affy chip that are in this GO pathway.
> 
> You can output tables like this using hyperGtable() in the 
> affycoretools package. The last column of your table will be 
> a bit messy because there will be variable numbers of Affy 
> IDs. I prefer a two-step approach: do the above table, and 
> then output the probesets for each row (i.e., each 
> significant GO term) in individual HTML or text tables using 
> hyperG2annaffy(), which is also in affycoretools.
> 
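For anyone who wants to see the idea behind that last column without installing anything, here is a minimal base-R sketch. It is not the affycoretools implementation, only an illustration of intersecting each GO term's probesets with the significant list and collapsing the result into one string per row. The names go2probes, sig.probes and go.tab, and all the IDs, are made up for the example; in a real analysis the mapping would come from the chip annotation package and the significant probes from topTable().

```r
# Hypothetical toy inputs (not real annotation data)
go2probes <- list(
  "GO:0000001" = c("p1", "p2", "p3", "p4"),
  "GO:0000002" = c("p3", "p5"),
  "GO:0000003" = c("p6", "p7", "p8")
)
sig.probes <- c("p2", "p3", "p5", "p9")

# Keep only the significant probes within each GO term
sig.by.go <- lapply(go2probes, intersect, sig.probes)

# One row per GO term; the variable-length ID list is collapsed
# into a single ";"-separated string so it fits in one column
go.tab <- data.frame(
  GO_id    = names(sig.by.go),
  n_sig    = vapply(sig.by.go, length, integer(1), USE.NAMES = FALSE),
  Gene_ids = vapply(sig.by.go, paste, character(1), collapse = ";",
                    USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)

# Drop GO terms that contain no significant probes
go.tab <- go.tab[go.tab$n_sig > 0, ]
print(go.tab)
```

A table like this can be written out with write.table(go.tab, file, sep="\t", quote=FALSE, row.names=FALSE). The collapsed column is what makes the output "a bit messy" for large GO terms, which is the argument for the two-step approach.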
> Note that affycoretools is in the devel repository, so you 
> need R-2.3.0dev to automatically download using e.g., 
> biocLite(). However, there is no dependency on R-2.3.0dev, so 
> you can download from the website and install by hand into 
> any reasonably recent version of R.
> 
> HTH,
> 
> Jim
> 
> 
> 
> > Having read the vignettes I have been able to generate most of this
> > table but not the last column containing the Affy_Ids (or ideally gene
> > symbols). I would be very grateful if someone could help me out with
> > this. The script I have used so far is attached.
> >  
> > Many thanks,
> >  
> > Quentin
> >  
> > 1. LOAD GENE EXPRESSION ANALYSIS DATA
> > =========================================================
> >  
> > a. This is a three-way comparison. The data are normalised and filtered, 
> > then analysed with limma/eBayes to give an MArrayLM object called fit2.
> >  
> > 2. LOAD LIBRARIES
> > =========================================================
> >  
> > library(GO)
> > library(GOstats)
> > library(annotate)
> > library(simpleaffy)
> > library(genefilter)
> > library(multtest)
> > library(affy)
> > library(limma)
> > library(gcrma)
> > library(xtable)
> > library(mouse4302)
> > library(mouse4302cdf)
> > library(annaffy)
> > library(Rgraphviz)
> >  
> > 3. MAKE COMPARISONS FOR DIFFERENTIAL EXPRESSION 
> > ==========================================================
> > # NOTE: each call below overwrites tab; run only the comparison you want
> > # B-CONTROL
> > tab<-topTable(fit2,coef=1)
> > # A-CONTROL
> > tab<-topTable(fit2,coef=2)
> > # A-B
> > tab<-topTable(fit2,coef=3)
> >  
> > # topTable applies a default multiple-testing adjustment
> >  
> > 4. DO GO ANALYSIS, MAKE FIGURE & MAKE TABLE 
> > ==========================================================
> > gn<-as.character(tab$ID)
> > gn
> > LLID<-unlist(mget(gn,mouse4302LOCUSID,ifnotfound=NA))
> > go<-makeGOGraph(as.character(LLID),"CC",removeRoot=FALSE)
> > go
> >  
> > # There are 3 choices for ontology: "MF", "BP" and "CC"
> >  
> > a. Plot Graphic
> > ----------------------------------------------------------
> > att<-list()
> > lab<-nodes(go)  # label each node with its own GO ID
> > names(lab)<-nodes(go)
> > att$label<-lab
> > plot(go,nodeAttrs=att)
> >  
> > # Are there more genes at one GO than expected?
> > ----------------------------------------------------------
> > hyp<-GOHyperG(unique(LLID),lib="mouse4302",what="CC")
> > names(hyp)
> > go.pv<-hyp$pvalues[nodes(go)]
> > go.pv<-sort(go.pv)
> >  
> > b. Create Table
> > ----------------------------------------------------------
> > sig<-go.pv[go.pv<0.05]
> > counts<-hyp$goCounts[names(sig)]
> > terms<-getGOTerm(names(sig))[["CC"]]
> > nch<-nchar(unlist(terms))
> > terms2<-substr(unlist(terms),1,50)
> > terms3<-paste(terms2,ifelse(nch>50,"...",""),sep="")
> > mat<-matrix(c(names(terms),terms3,round(sig,3),counts),ncol=4,
> >   dimnames=list(1:length(sig),c("GO ID","Term","p-value","# Genes")))
> > mat
> > write.table(mat,"A_B_GO-Table_CC.txt")
> > 
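On the "more genes at one GO than expected?" step in the script above: as far as I understand it, the p-values come from a hypergeometric test for over-representation. Below is a small worked sketch of that calculation in base R using phyper(). The counts N, K, n and k are invented purely for illustration and are not taken from this data set; it shows the arithmetic only, not what GOHyperG does internally.

```r
# Hypothetical counts, chosen only for illustration
N <- 10000  # annotated genes on the chip (the "universe")
K <- 200    # universe genes annotated to the GO term
n <- 150    # genes in the differentially expressed list
k <- 12     # genes in the list that are annotated to the GO term

# Overlap expected if the list were a random draw from the universe
expected <- n * K / N

# One-sided p-value P(X >= k): phyper() gives P(X > q) when
# lower.tail = FALSE, so pass q = k - 1
p.over <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)

# Same quantity obtained by summing the point probabilities
p.check <- sum(dhyper(k:n, K, N - K, n))

print(c(expected = expected, p.over = p.over, p.check = p.check))
```

The k - 1 is the part that is easy to get wrong: passing k instead would give P(X > k) and make the term look slightly more significant than it is.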
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> 
> 
> --
> James W. MacDonald, M.S.
> Biostatistician
> Affymetrix and cDNA Microarray Core
> University of Michigan Cancer Center
> 1500 E. Medical Center Drive
> 7410 CCGC
> Ann Arbor MI 48109
> 734-647-5623
>


