[BioC] GenomicFeatures makeTranscriptDbFromBiomart failure

Cook, Malcolm MEC at stowers.org
Thu Jan 12 00:36:11 CET 2012


FWIW: I too just needed to reinstall GenomicFeatures from source to squash this error:

> Make the TranscriptDb object ... Error in callSuper(...) : could not
> find function "initRefFields"
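
For anyone hitting the same error, here is a minimal sketch of the check and reinstall. It assumes the biocLite() installer mentioned further down this thread, and that biocLite() passes extra arguments such as type through to install.packages(). The Built check is only a hint, prompted by the "built under R version 2.14.1" warning in Tim's Mac session running R 2.14.0; it is not a confirmed diagnosis.

```r
# Which R version was the installed package built under?
# Returns NA if GenomicFeatures is not installed.
built <- packageDescription("GenomicFeatures", fields = "Built")
running <- paste(R.version$major, R.version$minor, sep = ".")
cat("Built under:", built, "\n")
cat("Running R:  ", running, "\n")

# Reinstall from source instead of using the pre-built binary.
# Assumption: biocLite() forwards type = "source" to install.packages().
source("http://bioconductor.org/biocLite.R")
biocLite("GenomicFeatures", type = "source")
```

Restart R after the reinstall so that the freshly built package is the one that gets loaded.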

~Malcolm


> -----Original Message-----
> From: bioconductor-bounces at r-project.org [mailto:bioconductor-
> bounces at r-project.org] On Behalf Of Tim Rayner
> Sent: Monday, January 09, 2012 7:59 AM
> To: Hervé Pagès
> Cc: bioconductor at r-project.org
> Subject: Re: [BioC] GenomicFeatures makeTranscriptDbFromBiomart failure
> 
> Hi Hervé,
> 
> Thanks very much for fixing this. I can confirm that GenomicFeatures
> 1.6.5 works on our Linux server. Interestingly, the warnings about
> duplicated levels have also now disappeared in that case.
> 
> I initially ran into a new problem with GenomicFeatures 1.6.6 (see the
> bug report and session info below); however, when I reinstalled
> GenomicFeatures using type='source' the error went away.
> 
> Cheers,
> 
> Tim
> 
> ## Successfully running on Linux:
> > sessionInfo()
> R version 2.14.1 (2011-12-22)
> Platform: x86_64-pc-linux-gnu (64-bit)
> 
> locale:
>  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
>  [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
>  [7] LC_PAPER=C                 LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> 
> other attached packages:
> [1] GenomicFeatures_1.6.5 AnnotationDbi_1.16.10 Biobase_2.14.0
> [4] GenomicRanges_1.6.4   IRanges_1.12.5
> 
> loaded via a namespace (and not attached):
>  [1] biomaRt_2.10.0     Biostrings_2.22.0  BSgenome_1.22.0    DBI_0.2-5
>  [5] RCurl_1.8-0        RSQLite_0.11.1     rtracklayer_1.14.4 tools_2.14.1
>  [9] XML_3.6-2          zlibbioc_1.0.0
> 
> 
> ## Strange problem with the pre-built package on Mac OS X? (GenomicFeatures 1.6.6)
> > library(GenomicFeatures)
> Loading required package: IRanges
> 
> Attaching package: 'IRanges'
> 
> The following object(s) are masked from 'package:base':
> 
>     cbind, eval, intersect, Map, mapply, order, paste, pmax, pmax.int,
>     pmin, pmin.int, rbind, rep.int, setdiff, table, union
> 
> Loading required package: GenomicRanges
> Loading required package: AnnotationDbi
> Loading required package: Biobase
> 
> Welcome to Bioconductor
> 
>   Vignettes contain introductory material. To view, type
>   'browseVignettes()'. To cite Bioconductor, see
>   'citation("Biobase")' and for packages 'citation("pkgname")'.
> 
> 
> Attaching package: 'Biobase'
> 
> The following object(s) are masked from 'package:IRanges':
> 
>     updateObject
> 
> Warning message:
> package 'GenomicFeatures' was built under R version 2.14.1
> > makeTranscriptDbFromBiomart(
> biomart=         circ_seqs=       dataset=         transcript_ids=
> > txdb <- makeTranscriptDbFromBiomart(biomart='ensembl',
> dataset='hsapiens_gene_ensembl')
> Download and preprocess the 'transcripts' data frame ... OK
> Download and preprocess the 'chrominfo' data frame ... OK
> Download and preprocess the 'splicings' data frame ... OK
> Download and preprocess the 'genes' data frame ... OK
> Prepare the 'metadata' data frame ... OK
> Make the TranscriptDb object ... Error in callSuper(...) : could not
> find function "initRefFields"
> > sessionInfo()
> R version 2.14.0 (2011-10-31)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
> 
> locale:
> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> 
> other attached packages:
> [1] GenomicFeatures_1.6.6 AnnotationDbi_1.16.10 Biobase_2.14.0
> [4] GenomicRanges_1.6.4   IRanges_1.12.5
> 
> loaded via a namespace (and not attached):
>  [1] biomaRt_2.10.0     Biostrings_2.22.0  BSgenome_1.22.0    DBI_0.2-5
>  [5] RCurl_1.8-0        RSQLite_0.11.1     rtracklayer_1.14.4 tools_2.14.0
>  [9] XML_3.6-2          zlibbioc_1.0.0
> 
> 2012/1/5 Hervé Pagès <hpages at fhcrc.org>:
> > Hi Tim,
> >
> >
> > On 11/09/2011 10:27 AM, Hervé Pagès wrote:
> >>
> >> Hi,
> >>
> >> On 11-11-09 03:33 AM, Tim Rayner wrote:
> >>>
> >>> Hi Marc,
> >>>
> >>> Thanks very much for looking into this, and also to Michael for
> >>> providing the patch. I've upgraded my GenomicRanges package and the
> >>> code now runs with a couple of warnings:
> >>>
> >>>> txdb.Hs2<- makeTranscriptDbFromBiomart(biomart='ensembl',
> >>>> dataset='hsapiens_gene_ensembl')
> >>>
> >>> Download and preprocess the 'transcripts' data frame ... OK
> >>> Download and preprocess the 'chrominfo' data frame ... FAILED! (=>
> >>> skipped)
> >>> Download and preprocess the 'splicings' data frame ... OK
> >>> Download and preprocess the 'genes' data frame ... OK
> >>> Prepare the 'metadata' data frame ... OK
> >>> Make the TranscriptDb object ... OK
> >>> Warning messages:
> >>> 1: In `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels)
> >>> else paste(labels, :
> >>> duplicated levels will not be allowed in factors anymore
> >>> 2: In `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels)
> >>> else paste(labels, :
> >>> duplicated levels will not be allowed in factors anymore
> >>> 3: In .normargChrominfo(chrominfo, transcripts$tx_chrom,
> >>> splicings$exon_chrom) :
> >>> chromosome lengths and circularity flags are not available for this
> >>> TranscriptDb object
> >>
> >>
> >> The first two warnings, plus the fact that downloading the chrominfo
> >> failed, do not look good. It didn't use to be like that. We'll
> >> investigate on our side and report later.
> >
> >
> > The problem that was preventing makeTranscriptDbFromBiomart() from
> > fetching the 'chrominfo' data frame (i.e. chromosome lengths) from
> > Ensembl has been fixed. Make sure you update to the latest version
> > of GenomicFeatures (v 1.6.5 in BioC release, v 1.7.8 in BioC
> > devel). Available via biocLite().
> >
> > The warnings about duplicated levels still need to be investigated.
> >
> > Cheers,
> >
> > H.
> >
> >>
> >> Cheers,
> >> H.
> >>
> >>>
> >>> So I think the problem is basically fixed. I wonder if perhaps the
> >>> issue was caused by truncated data transfers; I observed several
> >>> similar failures earlier yesterday afternoon, but in each case the
> >>> problem seemed to occur at a different point in the process.
> >>>
> >>> Thanks again,
> >>>
> >>> Tim
> >>>
> >>> On 8 November 2011 20:16, Marc Carlson<mcarlson at fhcrc.org> wrote:
> >>>>
> >>>> Hi Tim,
> >>>>
> >>>> There was a small bug last week for this method caused by a decision at
> >>>> Ensembl to start supporting pseudoautosomal regions, but it was fixed
> >>>> last week and should be fixed in the version of GenomicFeatures
> >>>> reported here.
> >>>> I just ran your code locally 4 minutes ago and it still works here. The
> >>>> only difference I can see is that my GenomicRanges package is one version
> >>>> higher than yours (GenomicRanges_1.6.2). Please update that package and
> >>>> then run it again and see if you have better luck with Ensembl.
> >>>>
> >>>> The patch that Michael mentioned actually arrived at the exact moment
> >>>> that I was testing the bug fix above, which means that it has some
> >>>> conflicts I will have to resolve, but it should be added to devel very
> >>>> soon.
> >>>>
> >>>>
> >>>> Marc
> >>>>
> >>>>
> >>>>
> >>>> On 11/08/2011 03:55 AM, Michael Lawrence wrote:
> >>>>>
> >>>>>
> >>>>> On Tue, Nov 8, 2011 at 3:19 AM, Tim Rayner <tfrayner at gmail.com> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I'm trying to make a TranscriptDb from the Ensembl human Biomart, but
> >>>>>> I've run into a problem. As shown below, the equivalent operation for
> >>>>>> the mouse Biomart works fine:
> >>>>>>
> >>>>>>> # Mouse TranscriptDb created without a hitch:
> >>>>>>> txdb.Mm <- makeTranscriptDbFromBiomart(biomart='ensembl',
> >>>>>>>     dataset='mmusculus_gene_ensembl')
> >>>>>> Download and preprocess the 'transcripts' data frame ... OK
> >>>>>> Download and preprocess the 'chrominfo' data frame ... OK
> >>>>>> Download and preprocess the 'splicings' data frame ... OK
> >>>>>> Download and preprocess the 'genes' data frame ... OK
> >>>>>> Prepare the 'metadata' data frame ... OK
> >>>>>> Make the TranscriptDb object ... OK
> >>>>>>
> >>>>>>> # Here's the problem:
> >>>>>>> txdb.Hs <- makeTranscriptDbFromBiomart(biomart='ensembl',
> >>>>>>>     dataset='hsapiens_gene_ensembl')
> >>>>>> Download and preprocess the 'transcripts' data frame ... OK
> >>>>>> Download and preprocess the 'chrominfo' data frame ... FAILED! (=>
> >>>>>> skipped)
> >>>>>> Download and preprocess the 'splicings' data frame ... Error in
> >>>>>> scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
> >>>>>> line 800380 did not have 11 elements
> >>>>>>
> >>>>>>> sessionInfo()
> >>>>>>
> >>>>>>
> >>>>>> R version 2.14.0 (2011-10-31)
> >>>>>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
> >>>>>>
> >>>>>> locale:
> >>>>>> [1] C
> >>>>>>
> >>>>>> attached base packages:
> >>>>>> [1] stats graphics grDevices utils datasets methods base
> >>>>>>
> >>>>>> other attached packages:
> >>>>>> [1] GenomicFeatures_1.6.1 AnnotationDbi_1.16.0 Biobase_2.14.0
> >>>>>> [4] GenomicRanges_1.6.1 IRanges_1.12.1
> >>>>>>
> >>>>>> loaded via a namespace (and not attached):
> >>>>>> [1] BSgenome_1.22.0    Biostrings_2.22.0  DBI_0.2-5          RCurl_1.6-10
> >>>>>> [5] RSQLite_0.10.0     XML_3.4-3          biomaRt_2.10.0     rtracklayer_1.14.1
> >>>>>> [9] tools_2.14.0       zlibbioc_1.0.0
> >>>>>>
> >>>>>> I don't know if this is an issue with the Biomart instance or the
> >>>>>> GenomicFeatures package. I was wondering if anyone had any
> >>>>>> suggestions as to how I might work around this?
> >>>>>>
> >>>>>> On a related note, would it be possible to add the ability to point
> >>>>>> makeTranscriptDbFromBiomart() at alternate Biomart hosts (as one
> >>>>>> would, for example, by calling
> >>>>>> biomaRt::useMart(host='www.ensembl.org', ...))?
> >>>>>
> >>>>>
> >>>>> We've submitted a patch that does just this, as well as supporting an
> >>>>> attribute prefix string for selecting alternative gene models.
> >>>>>
> >>>>>
> >>>>>> It would probably be good to be able to pass through the 'archive'
> >>>>>> argument to useMart as well.
> >>>>>>
> >>>>>> Many thanks,
> >>>>>>
> >>>>>> Tim Rayner
> >>>>>>
> >>>>>> --
> >>>>>> Bioinformatician
> >>>>>> Smith Lab, CIMR
> >>>>>> University of Cambridge
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> Bioconductor mailing list
> >>>>>> Bioconductor at r-project.org
> >>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>>>>> Search the archives:
> >>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >>>>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >>
> >
> >
> > --
> > Hervé Pagès
> >
> > Program in Computational Biology
> > Division of Public Health Sciences
> > Fred Hutchinson Cancer Research Center
> > 1100 Fairview Ave. N, M1-B514
> > P.O. Box 19024
> > Seattle, WA 98109-1024
> >
> > E-mail: hpages at fhcrc.org
> > Phone:  (206) 667-5791
> > Fax:    (206) 667-1319
> 


