[BioC] Annotation.db: how automatically call a mapping?
James W. MacDonald
jmacdon at med.umich.edu
Tue Jun 30 22:28:46 CEST 2009
Hooiveld, Guido wrote:
> Hi Jim,
> Since your suggestion looks indeed easy and seems to provide everything
> I would like to have, a gave it a try (I'll have a further look at
> Martin's comments/suggestions later).
> However, it seems that 'probes2table' cannot properly handle multiple
> contrast defined in limma:
>
>> library(affycoretools)
> Loading required package: GO.db
> Loading required package: DBI
> Loading required package: KEGG.db
>> probes2table(eset, featureNames(eset), annotation(eset),
> list("p-value"= fit2$p.value, "mean" = fit2$Amean), html = FALSE,
> filename = "output")
> Loading required package: moe430a.db
> Error in aafTable(items = otherdata) :
> All columns must be of equal length
>> length(fit2$p.value)
> [1] 68070
>> length(fit2$Amean)
> [1] 22690
>
> I analyzed three contrasts in limma, and 68070/3 is 22690, which exactly
> equals the number of probesets on the Affy MOE430A array. This thus
> explains the error.
>
> Question: can this easily be solved? Can limma2annaffy handle multiple
> contrasts? (at the moment I am not familiar with affycoretools at all).
limma2annaffy(eset, fit2, <designmatrixname>, <contrastsmatrixname>,
annotation(eset), pfilt=1, html = FALSE, interactive = FALSE)
should give you text tables for each contrast. I would normally use a
reasonable p-value and possibly a fold filter to reduce the output to
the reporters of interest, and use html = TRUE because my end users in
general like that better than the text. However, for all reporters on
this chip the HTML table is too huge to be of use.
>
> Thanks,
> Guido
>
>
>
>> -----Original Message-----
>> From: bioconductor-bounces at stat.math.ethz.ch
>> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of
>> James W. MacDonald
>> Sent: 30 June 2009 19:44
>> To: Hooiveld, Guido
>> Cc: bioconductor at stat.math.ethz.ch
>> Subject: Re: [BioC] Annotation.db: how automatically call a mapping?
>>
>> Hi Guido,
>>
>> Hooiveld, Guido wrote:
>>> Hi Martin,
>>>
>>> Indeed, another useful, straigh-forward possibility for mapping.
>>> However, I am now facing the problem of properly combining the
>>> annotation info with the expression data. This is what I
>> would like to
>>> do:
>>>
>>>> Tab_data <- exprs(eset[probeids])
>>>> Tab_data <- cbind(Tab_data, fit2$Amean) # to add average
>> expression
>>>> of
>>> LIMMA output
>>>> Tab_data <- cbind(Tab_data, fit2$p.value) # to add p-value of LIMMA
>>> output
>>> etc.
>>>
>>> This al goes fine, however adding the annotation info
>> 'mixes-up' the
>>> content of Tab_data; the annotation data replaces the first
>> column of
>>> Tab_data, and the content of all cells is replaced by 'null'. I
>>> suspect it has something to do with the type of object I
>> would like to
>>> merge, but I am not sure.
>>>
>>>> map.entrez <- getAnnMap("ENTREZID", annotation(eset))
>> map.entrez <-
>>>> as.list(map.entrez[probeids])
>> This sort of thing is going to get really difficult to do by
>> hand when you get to things that have a one-to-many
>> relationship. And you are already duplicating existing
>> efforts with what you have done so far.
>>
>> If you want to combine annotation data with results data, you
>> really want to be using the annaffy package which does lots
>> of these things seamlessly. And if you want things to be a
>> bit easier, you could consider using the affycoretools
>> package as well, which for the most part uses annaffy to
>> create output.
>>
>> You can do what you appear to want in one line:
>>
>> probes2table(eset, featureNames(eset), annotation(eset),
>> list("p-value"= fit2$p.value, "mean" = fit2$Amean), html =
>> FALSE, filename = "output")
>>
>> You might need to play around with the 'anncols' argument to
>> get what annotation data you might want.
>>
>> If you want output specific to the contrasts you have fit,
>> see ?limma2annaffy.
>>
>> Best,
>>
>> Jim
>>
>>
>>
>>>
>>>> Tab_data <- cbind(Tab_data, map.entrez)
>>> ^ in R this seems to work, but when saved as .txt the content of
>>> Tab_data is completely mixed up. Before 'adding' map.entrez
>> Tab_dat is
>>> OK.
>>>
>>>
>>>> write.table(cbind(rownames(Tab_data2), Tab_data2),
>>> file="test_1234.txt", sep="\t", col.names=TRUE, row.names=FALSE)
>>>
>>>> class(Tab_data)
>>> [1] "matrix"
>>>> class(map.entrez)
>>> [1] "list"
>>>
>>>
>>> Do you, or someone elsr, have a suggestion how to properly
>> link these
>>> two types of data?
>>> Thanks again,
>>> Guido
>>>
>>>
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: bioconductor-bounces at stat.math.ethz.ch
>>>> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf
>> Of Martin
>>>> Morgan
>>>> Sent: 30 June 2009 00:00
>>>> To: Hooiveld, Guido
>>>> Cc: bioconductor at stat.math.ethz.ch
>>>> Subject: Re: [BioC] Annotation.db: how automatically call
>> a mapping?
>>>> Hooiveld, Guido wrote:
>>>>> Hi,
>>>>>
>>>>> I am facing a problem i cannot solve myselves, despite
>> everything i
>>>>> read/know. But i assume the solution is easy for the more
>>>> knowledgable
>>>>> folks in BioC/R...
>>>>>
>>>>> This does work:
>>>>>> library(moe430a.db)
>>>>>> xxyy <- moe430aSYMBOL
>>>>>> xxyy
>>>>> SYMBOL map for chip moe430a (object of class "AnnDbBimap")
>>>>>
>>>>> However, for this to work you need to know the array type
>>>> of the data
>>>>> that is analyzed.
>>>>>
>>>>>
>>>>> Now i would like to automatically extract the (e.g.)
>> SYMBOL mapping
>>>>> from an annotation.db, thus by retrieving the array type
>>>> from the eset.
>>>>>
>>>>>> library(affy)
>>>>>> eset <- rma(data)
>>>>>> probeids <- featureNames(eset)
>>>>>> annotation(eset)
>>>>> [1] "moe430a"
>>>>>
>>>>> But how can i use this info to properly call the SYMBOL mapping?
>>>> Hi Guido --
>>>>
>>>> to get the appropriate map
>>>>
>>>> library(annotate)
>>>> map = getAnnMap("SYMBOL", annotation(eset))
>>>>
>>>> to select just the relevant probes
>>>>
>>>> map[probeids]
>>>>
>>>> toTable(map[probeids]) or as.list(map[probeids]) might be the next
>>>> step in the work flow.
>>>>
>>>> Martin
>>>>
>>>>>
>>>>> I tried this:
>>>>>> arraytype <- annotation(eset)
>>>>>> arraytype <- paste(arraytype, "db", sep = ".") arraytype
>>>>> [1] "moe430a.db"
>>>>>> arraytype <- paste("package", arraytype, sep = ":") gh <-
>>>>>> ls(arraytype) gh
>>>>> [1] "moe430a" "moe430a_dbconn"
>> "moe430a_dbfile"
>>>>> "moe430a_dbInfo" "moe430a_dbschema" "moe430aACCNUM"
>>>>> "moe430aALIAS2PROBE" "moe430aCHR" "moe430aCHRLENGTHS"
>>>>> "moe430aCHRLOC"
>>>>> [11] "moe430aCHRLOCEND" "moe430aENSEMBL"
>>>>> "moe430aENSEMBL2PROBE" "moe430aENTREZID" "moe430aENZYME"
>>>>> "moe430aENZYME2PROBE" "moe430aGENENAME" "moe430aGO"
>>>>> "moe430aGO2ALLPROBES" "moe430aGO2PROBE"
>>>>> [21] "moe430aMAP" "moe430aMAPCOUNTS" "moe430aMGI"
>>>>> "moe430aMGI2PROBE" "moe430aORGANISM" "moe430aPATH"
>>>>> "moe430aPATH2PROBE" "moe430aPFAM" "moe430aPMID"
>>>>> "moe430aPMID2PROBE"
>>>>> [31] "moe430aPROSITE" "moe430aREFSEQ" "moe430aSYMBOL"
>>>>> "moe430aUNIGENE" "moe430aUNIPROT"
>>>>>
>>>>>> gh[33]
>>>>> [1] "moe430aSYMBOL"
>>>>>> symbols <- mget(probeids, gh[33])
>>>>> Error in mget(probeids, gh[33]) : second argument must be an
>>>>> environment
>>>>>
>>>>> This also doesn't work:
>>>>>> symbols <- mget(probeids, envir=gh[33])
>>>>> Error in mget(probeids, envir = gh[33]) :
>>>>> second argument must be an environment
>>>>>
>>>>> My approach thus is the wrong approach to automatically extract
>>>>> mappings from a annotation.db.
>>>>> Since i don't know about any other possibility, i would
>>>> appreciate if
>>>>> someone could point me to a working solution.
>>>>>
>>>>> Thanks,
>>>>> Guido
>>>>>
>>>>>
>>>>> ------------------------------------------------
>>>>> Guido Hooiveld, PhD
>>>>> Nutrition, Metabolism & Genomics Group Division of Human
>> Nutrition
>>>>> Wageningen University Biotechnion, Bomenweg 2
>>>>> NL-6703 HD Wageningen
>>>>> the Netherlands
>>>>> tel: (+)31 317 485788
>>>>> fax: (+)31 317 483342
>>>>> internet: http://nutrigene.4t.com <http://nutrigene.4t.com/>
>>>>> email: guido.hooiveld at wur.nl
>>>>>
>>>>>
>>>>>
>>>>> [[alternative HTML version deleted]]
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at stat.math.ethz.ch
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives:
>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>
>>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>> --
>> James W. MacDonald, M.S.
>> Biostatistician
>> Douglas Lab
>> University of Michigan
>> Department of Human Genetics
>> 5912 Buhl
>> 1241 E. Catherine St.
>> Ann Arbor MI 48109-5618
>> 734-615-7826
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>
--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
More information about the Bioconductor
mailing list