[BioC] Annotation.db: how automatically call a mapping?
Hooiveld, Guido
Guido.Hooiveld at wur.nl
Tue Jun 30 23:01:27 CEST 2009
Hi again,
Some remarks/questions:
First of all, thanks for all your help!
Meanwhile I got limma2annaffy to work. Nice HTML output! However some
minor things:
- I experienced that the library 'annaffy' is not automatically loaded.
- how to get the scientific notation for the p-values (e.g. 1.24E-12;
for all 30 genes it is now "0").
- in the output file, the header 'Fold Change' actually reflects
'log2FC'.
Regarding the function probes2table, how to access the results [...html
= FALSE, filename = "output")]?
Is it (=output) automatically saved? If so, where?; it is not in my
working directory.
Thanks,
Guido
> limma2annaffy(eset, fit2, design, cont.matrix, "moe430a", adjust =
"fdr",
anncols = aaf.handler()[c(1:3, 6:7, 9:12)], number = 30, pfilt = NULL,
fldfilt= NULL,tstat = TRUE, pval = TRUE, FC = TRUE, expression = TRUE,
html = TRUE, text = FALSE, save = FALSE, addname = NULL, interactive =
TRUE)
You are going to output 3 tables,
With this many genes in each:
30 30 30
Do you want to accept or change these values? [ a/c ]a
Error in as.vector(x) : could not find function "aaf.handler"
> library(annaffy)
> limma2annaffy(eset, fit2, design, cont.matrix, "moe430a", adjust =
"fdr",
anncols = aaf.handler()[c(1:3, 6:7, 9:12)], number = 30, pfilt = NULL,
fldfilt= NULL,tstat = TRUE, pval = TRUE, FC = TRUE, expression = TRUE,
html = TRUE, text = FALSE, save = FALSE, addname = NULL, interactive =
TRUE)
You are going to output 3 tables,
With this many genes in each:
30 30 30
Do you want to accept or change these values? [ a/c ]a
> -----Original Message-----
> From: James W. MacDonald [mailto:jmacdon at med.umich.edu]
> Sent: 30 June 2009 22:29
> To: Hooiveld, Guido
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] Annotation.db: how automatically call a mapping?
>
>
>
> Hooiveld, Guido wrote:
> > Hi Jim,
> > Since your suggestion looks indeed easy and seems to provide
> > everything I would like to have, a gave it a try (I'll have
> a further
> > look at Martin's comments/suggestions later).
> > However, it seems that 'probes2table' cannot properly
> handle multiple
> > contrast defined in limma:
> >
> >> library(affycoretools)
> > Loading required package: GO.db
> > Loading required package: DBI
> > Loading required package: KEGG.db
> >> probes2table(eset, featureNames(eset), annotation(eset),
> > list("p-value"= fit2$p.value, "mean" = fit2$Amean), html = FALSE,
> > filename = "output") Loading required package: moe430a.db Error in
> > aafTable(items = otherdata) :
> > All columns must be of equal length
> >> length(fit2$p.value)
> > [1] 68070
> >> length(fit2$Amean)
> > [1] 22690
> >
> > I analyzed three contrasts in limma, and 68070/3 is 22690, which
> > exactly equals the number of probesets on the Affy MOE430A
> array. This
> > thus explains the error.
> >
> > Question: can this easily be solved? Can limma2annaffy
> handle multiple
> > contrasts? (at the moment I am not familiar with
> affycoretools at all).
>
> limma2annaffy(eset, fit2, <designmatrixname>,
> <contrastsmatrixname>, annotation(eset), pfilt=1, html =
> FALSE, interactive = FALSE)
>
> should give you text tables for each contrast. I would
> normally use a reasonable p-value and possibly a fold filter
> to reduce the output to the reporters of interest, and use
> html = TRUE because my end users in general like that better
> than the text. However, for all reporters on this chip the
> HTML table is too huge to be of use.
>
> >
> > Thanks,
> > Guido
> >
> >
> >
> >> -----Original Message-----
> >> From: bioconductor-bounces at stat.math.ethz.ch
> >> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf
> Of James W.
> >> MacDonald
> >> Sent: 30 June 2009 19:44
> >> To: Hooiveld, Guido
> >> Cc: bioconductor at stat.math.ethz.ch
> >> Subject: Re: [BioC] Annotation.db: how automatically call
> a mapping?
> >>
> >> Hi Guido,
> >>
> >> Hooiveld, Guido wrote:
> >>> Hi Martin,
> >>>
> >>> Indeed, another useful, straigh-forward possibility for mapping.
> >>> However, I am now facing the problem of properly combining the
> >>> annotation info with the expression data. This is what I
> >> would like to
> >>> do:
> >>>
> >>>> Tab_data <- exprs(eset[probeids])
> >>>> Tab_data <- cbind(Tab_data, fit2$Amean) # to add average
> >> expression
> >>>> of
> >>> LIMMA output
> >>>> Tab_data <- cbind(Tab_data, fit2$p.value) # to add
> p-value of LIMMA
> >>> output
> >>> etc.
> >>>
> >>> This al goes fine, however adding the annotation info
> >> 'mixes-up' the
> >>> content of Tab_data; the annotation data replaces the first
> >> column of
> >>> Tab_data, and the content of all cells is replaced by 'null'. I
> >>> suspect it has something to do with the type of object I
> >> would like to
> >>> merge, but I am not sure.
> >>>
> >>>> map.entrez <- getAnnMap("ENTREZID", annotation(eset))
> >> map.entrez <-
> >>>> as.list(map.entrez[probeids])
> >> This sort of thing is going to get really difficult to do by hand
> >> when you get to things that have a one-to-many
> relationship. And you
> >> are already duplicating existing efforts with what you
> have done so
> >> far.
> >>
> >> If you want to combine annotation data with results data,
> you really
> >> want to be using the annaffy package which does lots of
> these things
> >> seamlessly. And if you want things to be a bit easier, you could
> >> consider using the affycoretools package as well, which
> for the most
> >> part uses annaffy to create output.
> >>
> >> You can do what you appear to want in one line:
> >>
> >> probes2table(eset, featureNames(eset), annotation(eset),
> >> list("p-value"= fit2$p.value, "mean" = fit2$Amean), html = FALSE,
> >> filename = "output")
> >>
> >> You might need to play around with the 'anncols' argument
> to get what
> >> annotation data you might want.
> >>
> >> If you want output specific to the contrasts you have fit, see
> >> ?limma2annaffy.
> >>
> >> Best,
> >>
> >> Jim
> >>
> >>
> >>
> >>>
> >>>> Tab_data <- cbind(Tab_data, map.entrez)
> >>> ^ in R this seems to work, but when saved as .txt the
> content of
> >>> Tab_data is completely mixed up. Before 'adding' map.entrez
> >> Tab_dat is
> >>> OK.
> >>>
> >>>
> >>>> write.table(cbind(rownames(Tab_data2), Tab_data2),
> >>> file="test_1234.txt", sep="\t", col.names=TRUE, row.names=FALSE)
> >>>
> >>>> class(Tab_data)
> >>> [1] "matrix"
> >>>> class(map.entrez)
> >>> [1] "list"
> >>>
> >>>
> >>> Do you, or someone elsr, have a suggestion how to properly
> >> link these
> >>> two types of data?
> >>> Thanks again,
> >>> Guido
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: bioconductor-bounces at stat.math.ethz.ch
> >>>> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf
> >> Of Martin
> >>>> Morgan
> >>>> Sent: 30 June 2009 00:00
> >>>> To: Hooiveld, Guido
> >>>> Cc: bioconductor at stat.math.ethz.ch
> >>>> Subject: Re: [BioC] Annotation.db: how automatically call
> >> a mapping?
> >>>> Hooiveld, Guido wrote:
> >>>>> Hi,
> >>>>>
> >>>>> I am facing a problem i cannot solve myselves, despite
> >> everything i
> >>>>> read/know. But i assume the solution is easy for the more
> >>>> knowledgable
> >>>>> folks in BioC/R...
> >>>>>
> >>>>> This does work:
> >>>>>> library(moe430a.db)
> >>>>>> xxyy <- moe430aSYMBOL
> >>>>>> xxyy
> >>>>> SYMBOL map for chip moe430a (object of class "AnnDbBimap")
> >>>>>
> >>>>> However, for this to work you need to know the array type
> >>>> of the data
> >>>>> that is analyzed.
> >>>>>
> >>>>>
> >>>>> Now i would like to automatically extract the (e.g.)
> >> SYMBOL mapping
> >>>>> from an annotation.db, thus by retrieving the array type
> >>>> from the eset.
> >>>>>
> >>>>>> library(affy)
> >>>>>> eset <- rma(data)
> >>>>>> probeids <- featureNames(eset)
> >>>>>> annotation(eset)
> >>>>> [1] "moe430a"
> >>>>>
> >>>>> But how can i use this info to properly call the SYMBOL mapping?
> >>>> Hi Guido --
> >>>>
> >>>> to get the appropriate map
> >>>>
> >>>> library(annotate)
> >>>> map = getAnnMap("SYMBOL", annotation(eset))
> >>>>
> >>>> to select just the relevant probes
> >>>>
> >>>> map[probeids]
> >>>>
> >>>> toTable(map[probeids]) or as.list(map[probeids]) might
> be the next
> >>>> step in the work flow.
> >>>>
> >>>> Martin
> >>>>
> >>>>>
> >>>>> I tried this:
> >>>>>> arraytype <- annotation(eset)
> >>>>>> arraytype <- paste(arraytype, "db", sep = ".") arraytype
> >>>>> [1] "moe430a.db"
> >>>>>> arraytype <- paste("package", arraytype, sep = ":") gh <-
> >>>>>> ls(arraytype) gh
> >>>>> [1] "moe430a" "moe430a_dbconn"
> >> "moe430a_dbfile"
> >>>>> "moe430a_dbInfo" "moe430a_dbschema" "moe430aACCNUM"
> >>>>> "moe430aALIAS2PROBE" "moe430aCHR"
> "moe430aCHRLENGTHS"
> >>>>> "moe430aCHRLOC"
> >>>>> [11] "moe430aCHRLOCEND" "moe430aENSEMBL"
> >>>>> "moe430aENSEMBL2PROBE" "moe430aENTREZID" "moe430aENZYME"
> >>>>> "moe430aENZYME2PROBE" "moe430aGENENAME" "moe430aGO"
> >>>>> "moe430aGO2ALLPROBES" "moe430aGO2PROBE"
> >>>>> [21] "moe430aMAP" "moe430aMAPCOUNTS" "moe430aMGI"
> >>>>> "moe430aMGI2PROBE" "moe430aORGANISM" "moe430aPATH"
> >>>>> "moe430aPATH2PROBE" "moe430aPFAM" "moe430aPMID"
> >>>>> "moe430aPMID2PROBE"
> >>>>> [31] "moe430aPROSITE" "moe430aREFSEQ"
> "moe430aSYMBOL"
> >>>>> "moe430aUNIGENE" "moe430aUNIPROT"
> >>>>>
> >>>>>> gh[33]
> >>>>> [1] "moe430aSYMBOL"
> >>>>>> symbols <- mget(probeids, gh[33])
> >>>>> Error in mget(probeids, gh[33]) : second argument must be an
> >>>>> environment
> >>>>>
> >>>>> This also doesn't work:
> >>>>>> symbols <- mget(probeids, envir=gh[33])
> >>>>> Error in mget(probeids, envir = gh[33]) :
> >>>>> second argument must be an environment
> >>>>>
> >>>>> My approach thus is the wrong approach to automatically extract
> >>>>> mappings from a annotation.db.
> >>>>> Since i don't know about any other possibility, i would
> >>>> appreciate if
> >>>>> someone could point me to a working solution.
> >>>>>
> >>>>> Thanks,
> >>>>> Guido
> >>>>>
> >>>>>
> >>>>> ------------------------------------------------
> >>>>> Guido Hooiveld, PhD
> >>>>> Nutrition, Metabolism & Genomics Group Division of Human
> >> Nutrition
> >>>>> Wageningen University Biotechnion, Bomenweg 2
> >>>>> NL-6703 HD Wageningen
> >>>>> the Netherlands
> >>>>> tel: (+)31 317 485788
> >>>>> fax: (+)31 317 483342
> >>>>> internet: http://nutrigene.4t.com <http://nutrigene.4t.com/>
> >>>>> email: guido.hooiveld at wur.nl
> >>>>>
> >>>>>
> >>>>>
> >>>>> [[alternative HTML version deleted]]
> >>>>>
> >>>>> _______________________________________________
> >>>>> Bioconductor mailing list
> >>>>> Bioconductor at stat.math.ethz.ch
> >>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>>>> Search the archives:
> >>>>>
> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >>>> _______________________________________________
> >>>> Bioconductor mailing list
> >>>> Bioconductor at stat.math.ethz.ch
> >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>>> Search the archives:
> >>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >>>>
> >>>>
> >>> _______________________________________________
> >>> Bioconductor mailing list
> >>> Bioconductor at stat.math.ethz.ch
> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>> Search the archives:
> >>> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >> --
> >> James W. MacDonald, M.S.
> >> Biostatistician
> >> Douglas Lab
> >> University of Michigan
> >> Department of Human Genetics
> >> 5912 Buhl
> >> 1241 E. Catherine St.
> >> Ann Arbor MI 48109-5618
> >> 734-615-7826
> >>
> >> _______________________________________________
> >> Bioconductor mailing list
> >> Bioconductor at stat.math.ethz.ch
> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >> Search the archives:
> >> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >>
> >>
> >
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> Douglas Lab
> University of Michigan
> Department of Human Genetics
> 5912 Buhl
> 1241 E. Catherine St.
> Ann Arbor MI 48109-5618
> 734-615-7826
>
>
More information about the Bioconductor
mailing list