[BioC] biomaRt ensembl mmusculus does not contain all ensembl IDs (lincRNA, miRNA etc)?
Steffen Durinck
durinck.steffen at gene.com
Mon Apr 18 22:41:51 CEST 2011
Hi Duke,
It looks like this is a BioMart server issue where the wrong type of
table join is made with the entrezgene table.
If you remove the entrezgene attribute you'll get everything back:
> getBM(filters="ensembl_transcript_id",
+       attributes=c("ensembl_transcript_id", "ensembl_gene_id",
+                    "external_transcript_id", "refseq_dna"),
+       values=ensTransIDs, mart=mart)
  ensembl_transcript_id    ensembl_gene_id external_transcript_id refseq_dna
1    ENSMUST00000000001 ENSMUSG00000000001              Gnai3-001  NM_010306
2    ENSMUST00000042585 ENSMUSG00000037982             Gm9725-201
3    ENSMUST00000083463 ENSMUSG00000065397             Mir155-201  NR_029565
We notified the BioMart team of this behavior a while ago, and they
said they would change it in the next release.
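
Until that is fixed, one possible workaround is to fetch entrezgene in a
separate query and join it back on the client side with a left join. The
sketch below only illustrates the join behavior with base R merge(); the
two data frames are typed in by hand from the output in this thread, and
it assumes, for illustration only, that the Entrez table has a row for
the Gnai3 transcript and none for the other two. It is not a statement
of what the BioMart server does internally.

```r
# Annotation as returned without the entrezgene attribute
# (values copied from the output above).
annot <- data.frame(
  ensembl_transcript_id = c("ENSMUST00000000001", "ENSMUST00000042585",
                            "ENSMUST00000083463"),
  ensembl_gene_id = c("ENSMUSG00000000001", "ENSMUSG00000037982",
                      "ENSMUSG00000065397"),
  stringsAsFactors = FALSE
)

# Entrez mapping as a separate query might return it.
# ASSUMPTION: only the Gnai3 transcript has a row here.
entrez <- data.frame(
  ensembl_transcript_id = "ENSMUST00000000001",
  entrezgene = 14679L,
  stringsAsFactors = FALSE
)

# Inner join: transcripts with no match in 'entrez' are dropped.
inner <- merge(annot, entrez, by = "ensembl_transcript_id")

# Left join: every transcript is kept; missing Entrez IDs become NA.
left <- merge(annot, entrez, by = "ensembl_transcript_id", all.x = TRUE)

print(inner)
print(left)
```

With a live mart, annot would come from the getBM call above and entrez
from a second getBM call that requests only ensembl_transcript_id and
entrezgene.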
Cheers,
Steffen
On Mon, Apr 18, 2011 at 1:33 PM, Duke <duke.lists at gmx.com> wrote:
> Hi folks,
>
> Following the biomaRt usage instructions, I am trying to get information
> for our mmu data. The code I used is below:
>
> ----------
> library(biomaRt)
> mart <- useDataset("mmusculus_gene_ensembl", useMart("ensembl"))
> ensTransIDs <- c("ENSMUST00000000001", "ENSMUST00000083463",
>                  "ENSMUST00000042585")
> getBM(filters = "ensembl_transcript_id",
>       attributes = c("ensembl_transcript_id", "ensembl_gene_id",
>                      "external_transcript_id", "external_gene_id",
>                      "refseq_dna", "entrezgene"),
>       values = ensTransIDs, mart = mart)
> ----------
>
> This code runs fine with some transcript IDs, but for others (for
> example, lincRNAs or miRNAs) it gives empty results. For example, the code
> above, run on one gene, one lincRNA and one miRNA, produced this result:
>
>   ensembl_transcript_id    ensembl_gene_id external_transcript_id
> 1    ENSMUST00000000001 ENSMUSG00000000001              Gnai3-001
>   external_gene_id refseq_dna entrezgene
> 1            Gnai3  NM_010306      14679
>
>
> => only gene Gnai3 is detected, the other two are not.
>
> Does anybody know what I am doing wrong here, or is it just that the
> Ensembl database does not contain all the available transcript IDs?
>
> For the record, here is my sessionInfo():
>
>> sessionInfo()
> R version 2.12.2 (2011-02-25)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] biomaRt_2.6.0
>
> loaded via a namespace (and not attached):
> [1] RCurl_1.4-3 XML_3.2-0 tools_2.12.2
>
> Thanks,
>
> D.