[BioC] Annotation.db: how automatically call a mapping?
Martin Morgan
mtmorgan at fhcrc.org
Tue Jun 30 18:42:47 CEST 2009
Hooiveld, Guido wrote:
> Hi Martin,
>
> Indeed, another useful, straightforward possibility for mapping.
> However, I am now facing the problem of properly combining the
> annotation info with the expression data. This is what I would like to
> do:
>
>> Tab_data <- exprs(eset[probeids])
>> Tab_data <- cbind(Tab_data, fit2$Amean) # to add average expression of LIMMA output
>> Tab_data <- cbind(Tab_data, fit2$p.value) # to add p-value of LIMMA output
> etc.
>
> This all goes fine; however, adding the annotation info mixes up the
> content of Tab_data: the annotation data replaces the first column of
> Tab_data, and the content of all cells is replaced by 'null'. I suspect
> it has something to do with the type of object I would like to merge,
> but I am not sure.
>
>> map.entrez <- getAnnMap("ENTREZID", annotation(eset))
>> map.entrez <- as.list(map.entrez[probeids])
>
>
>> Tab_data <- cbind(Tab_data, map.entrez)
This cbinds a matrix and a list, which is why the result is garbled. Check
that the mapping between probe id and Entrez id is strictly 1:1, convert the
list to a vector, and use its names to subset and replace in a coordinated way:
library(annotate)
data(sample.ExpressionSet)
obj <- sample.ExpressionSet # save typing ;)
map <- getAnnMap('ENTREZID', annotation(obj))
submap <- map[featureNames(obj)]
elts <- as.list(submap)
stopifnot(all(sapply(elts, length) == 1)) # every probe maps to exactly one id
tabdat <- as.data.frame(exprs(obj)) # conceptually no longer a matrix
tabdat[names(elts), "ENTREZID"] <- unlist(elts, use.names=FALSE)
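To see why the original cbind misbehaves, here is a minimal base-R sketch. The toy objects `m` and `ids` (and the probe and Entrez ids in them) are made up for illustration and are not from the data in this thread; the list is deliberately in a different order than the matrix rows to show that assignment by name keeps things aligned.

```r
# Toy stand-ins (hypothetical): a 2 x 2 expression matrix and a
# probe -> id list, deliberately ordered differently from the rows
m <- matrix(c(1.5, 2.5, 3.5, 4.5), nrow = 2,
            dimnames = list(c("p1", "p2"), c("s1", "s2")))
ids <- list(p2 = "200", p1 = "100")

# cbind() of a matrix and a list gives a list-mode matrix:
# every cell becomes a list element instead of a plain number
bad <- cbind(m, ids)
typeof(bad)

# Convert to a data frame, flatten the list, and assign by row name
tab <- as.data.frame(m)
tab[names(ids), "ENTREZID"] <- unlist(ids, use.names = FALSE)
tab
```

Because the rows are addressed by name, the ids end up next to the right probes even though `ids` is not in row order, and the expression columns stay numeric.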
If the objective were something other than exporting data from R, and
'SomeData' were something experiment-specific (like the p-values from limma),
I'd suggest something along the lines of
featureData(obj)[["SomeData", labelDescription="describe SomeData"]] <- SomeData
to add the data to obj, and to carry it forward in a coordinated fashion
for subsequent analysis, e.g., eventually
forOutput <- cbind(exprs(obj), fData(obj))
(The syntax for simultaneously creating and assigning a _subset_ of
featureData is a little convoluted:
featureData(obj)[["...", labelD...]][indexToCreate] <- values .)
In this case also one wants to make sure the data is appropriately
formatted for standard R operations, e.g., cbinding a matrix / data
frame with a vector, rather than a list.
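As a rough sketch of that last point applied to the export step, the following base-R example flattens the list to a plain vector before writing a tab-delimited file, then reads the file back as a sanity check. The toy data frame `tab`, the list `ids`, and the temporary file are assumptions for illustration only.

```r
# Hypothetical toy data: expression values plus a probe -> id list
tab <- data.frame(s1 = c(1.5, 2.5), s2 = c(3.5, 4.5),
                  row.names = c("p1", "p2"))
ids <- list(p1 = "100", p2 = "200")

# Guard against probes mapping to zero or several ids, then flatten
# the list to a vector ordered like the rows of 'tab'
stopifnot(all(sapply(ids, length) == 1))
tab$ENTREZID <- unlist(ids[rownames(tab)], use.names = FALSE)

# col.names = NA writes an empty header cell above the row names,
# so the header lines up with the data columns
out <- tempfile(fileext = ".txt")
write.table(tab, file = out, sep = "\t", quote = FALSE, col.names = NA)

# Read the file back to confirm the layout survived the round trip
back <- read.delim(out, row.names = 1, check.names = FALSE)
back
```

Using `col.names = NA` avoids having to cbind the row names in by hand before writing.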
Martin
> ^ in R this seems to work, but when saved as .txt the content of
> Tab_data is completely mixed up. Before 'adding' map.entrez, Tab_data is
> OK.
>
>
>> write.table(cbind(rownames(Tab_data2), Tab_data2), file="test_1234.txt", sep="\t", col.names=TRUE, row.names=FALSE)
>
>> class(Tab_data)
> [1] "matrix"
>> class(map.entrez)
> [1] "list"
>
>
> Do you, or someone else, have a suggestion for how to properly link these
> two types of data?
> Thanks again,
> Guido
>
>
>
>
>
>> -----Original Message-----
>> From: bioconductor-bounces at stat.math.ethz.ch
>> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of
>> Martin Morgan
>> Sent: 30 June 2009 00:00
>> To: Hooiveld, Guido
>> Cc: bioconductor at stat.math.ethz.ch
>> Subject: Re: [BioC] Annotation.db: how automatically call a mapping?
>>
>> Hooiveld, Guido wrote:
>>> Hi,
>>>
>>> I am facing a problem I cannot solve myself, despite everything I
>>> have read and know. But I assume the solution is easy for the more
>>> knowledgeable folks in BioC/R...
>>>
>>> This does work:
>>>> library(moe430a.db)
>>>> xxyy <- moe430aSYMBOL
>>>> xxyy
>>> SYMBOL map for chip moe430a (object of class "AnnDbBimap")
>>>
>>> However, for this to work you need to know the array type of the data
>>> that is analyzed.
>>>
>>>
>>> Now I would like to automatically extract the (e.g.) SYMBOL mapping
>>> from an annotation.db, thus by retrieving the array type from the eset.
>>>
>>>> library(affy)
>>>> eset <- rma(data)
>>>> probeids <- featureNames(eset)
>>>> annotation(eset)
>>> [1] "moe430a"
>>>
>>> But how can i use this info to properly call the SYMBOL mapping?
>> Hi Guido --
>>
>> to get the appropriate map
>>
>> library(annotate)
>> map = getAnnMap("SYMBOL", annotation(eset))
>>
>> to select just the relevant probes
>>
>> map[probeids]
>>
>> toTable(map[probeids]) or as.list(map[probeids]) might be the
>> next step in the work flow.
>>
>> Martin
>>
>>>
>>> I tried this:
>>>> arraytype <- annotation(eset)
>>>> arraytype <- paste(arraytype, "db", sep = ".")
>>>> arraytype
>>> [1] "moe430a.db"
>>>> arraytype <- paste("package", arraytype, sep = ":")
>>>> gh <- ls(arraytype)
>>>> gh
>>> [1] "moe430a" "moe430a_dbconn" "moe430a_dbfile"
>>> "moe430a_dbInfo" "moe430a_dbschema" "moe430aACCNUM"
>>> "moe430aALIAS2PROBE" "moe430aCHR" "moe430aCHRLENGTHS"
>>> "moe430aCHRLOC"
>>> [11] "moe430aCHRLOCEND" "moe430aENSEMBL"
>>> "moe430aENSEMBL2PROBE" "moe430aENTREZID" "moe430aENZYME"
>>> "moe430aENZYME2PROBE" "moe430aGENENAME" "moe430aGO"
>>> "moe430aGO2ALLPROBES" "moe430aGO2PROBE"
>>> [21] "moe430aMAP" "moe430aMAPCOUNTS" "moe430aMGI"
>>> "moe430aMGI2PROBE" "moe430aORGANISM" "moe430aPATH"
>>> "moe430aPATH2PROBE" "moe430aPFAM" "moe430aPMID"
>>> "moe430aPMID2PROBE"
>>> [31] "moe430aPROSITE" "moe430aREFSEQ" "moe430aSYMBOL"
>>> "moe430aUNIGENE" "moe430aUNIPROT"
>>>
>>>> gh[33]
>>> [1] "moe430aSYMBOL"
>>>> symbols <- mget(probeids, gh[33])
>>> Error in mget(probeids, gh[33]) : second argument must be an
>>> environment
>>>
>>> This also doesn't work:
>>>> symbols <- mget(probeids, envir=gh[33])
>>> Error in mget(probeids, envir = gh[33]) :
>>> second argument must be an environment
>>>
>>> My approach is thus the wrong way to automatically extract
>>> mappings from an annotation.db.
>>> Since I don't know of any other possibility, I would appreciate it if
>>> someone could point me to a working solution.
>>>
>>> Thanks,
>>> Guido
>>>
>>>
>>> ------------------------------------------------
>>> Guido Hooiveld, PhD
>>> Nutrition, Metabolism & Genomics Group Division of Human Nutrition
>>> Wageningen University Biotechnion, Bomenweg 2
>>> NL-6703 HD Wageningen
>>> the Netherlands
>>> tel: (+)31 317 485788
>>> fax: (+)31 317 483342
>>> internet: http://nutrigene.4t.com
>>> email: guido.hooiveld at wur.nl
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>