[BioC] R: is there an identifier that uniquely identifies a gene all over the many databases?
Steve Lianoglou
mailinglist.honeypot at gmail.com
Mon Jul 13 01:52:30 CEST 2009

Hi,
> My goal is to get the 3'UTR sequence associated with experimentally
> validated genes.
> When I enter the species "Human" and a miRNA identifier "hsa-miR-yyy",
> the TarBase interface returns a
> list of all genes (ENSGxxxxxx) that have been experimentally tested.
> I pass each ENSGxxxxxx identifier to getSequence (a biomaRt function)
> to get the 3'UTR sequence.
> I was surprised to find multiple 3'UTR sequences associated with the
> same ENSGxxxxxx.
> Maybe each transcript is identified by a unique ENSTxxxx
> identifier... TRUE/FALSE?
That's likely the case: a single Ensembl gene (ENSG) can have several
transcripts (ENST), each with its own 3' UTR. You can easily verify
this yourself.
Just add "ensembl_transcript_id" (in addition to the ensembl_gene_id
you already have) to the attributes returned by your getBM query, and
see whether that explains the multiple 3_utr_start/3_utr_end results
you get.
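Something along these lines should do it. Treat it as a rough sketch rather than tested code: the mart and dataset names assume the current public Ensembl BioMart, and the gene ID is only a placeholder I picked for illustration (it is not from your TarBase list), so substitute your own ENSG identifiers.

```r
library(biomaRt)

# Assumption: the public Ensembl mart and the human gene dataset.
mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")

# Placeholder ID for illustration only; replace with the ENSG
# identifiers returned by TarBase.
gene_ids <- c("ENSG00000139618")

# Ask for the transcript ID alongside the gene ID and UTR coordinates,
# so each returned row can be traced back to a specific transcript.
utr_coords <- getBM(
  attributes = c("ensembl_gene_id", "ensembl_transcript_id",
                 "3_utr_start", "3_utr_end"),
  filters    = "ensembl_gene_id",
  values     = gene_ids,
  mart       = mart
)

# Rows without 3' UTR coordinates may come back as NA; drop them.
utr_coords <- utr_coords[!is.na(utr_coords[["3_utr_start"]]), ]

# Number of distinct transcripts with a 3' UTR for each gene.
tapply(utr_coords$ensembl_transcript_id,
       utr_coords$ensembl_gene_id,
       function(x) length(unique(x)))

# Fetch the 3' UTR sequences keyed by transcript rather than by gene,
# so every sequence is labelled with its ENST identifier.
utr_seqs <- getSequence(
  id      = unique(utr_coords$ensembl_transcript_id),
  type    = "ensembl_transcript_id",
  seqType = "3utr",
  mart    = mart
)
head(utr_seqs)
```

If the per-gene transcript count is greater than one, that would account for the multiple 3'UTR sequences you see for the same ENSG identifier.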
-steve
--
Steve Lianoglou
Graduate Student: Physiology, Biophysics and Systems Biology
Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos
    
    