[BioC] Rsamtools: Realloc integer overflow?

Hervé Pagès hpages at fhcrc.org
Tue Jun 4 04:33:08 CEST 2013


Hi Martin,

On 06/03/2013 06:26 PM, Martin Morgan wrote:
> On 06/03/2013 05:27 PM, Michael Lawrence wrote:
>> Hey guys,
>>
>> Whenever I try to calculate the coverage for a BAM file with more than,
>> say, 500 million reads, I get this error:
>>
>> Error in coverage(readBamGappedAlignments(x, param = param), shift =
>> shift,  : \n  error in evaluating the argument 'x' in selecting a method
>> for function 'coverage': Error in value[[3L]](cond) (from #2) : \n
>> 'Realloc' could not re-allocate memory (18446744065128005632 bytes)\n
>>
>> This looks like integer overflow, possibly within _grow_SCAN_BAM_DATA().
>> Could we just use long there?
>
> I wonder if it would be more sensible, if less convenient, to do this
> (under Bioc-devel)
>
>    bf <- open(BamFile(fl, yieldSize=100000000))
>    cvg <- coverage(readGAlignmentsFromBam(bf))
>    while (length(aln <- readGAlignmentsFromBam(bf)))
>        cvg <- cvg + coverage(aln)
>    close(bf)
>
> ? It opens the door for better memory management and parallel evaluation.
>
> I'm concerned that using size_t (Realloc casts to this) or ptrdiff_t
> (the size of R long vectors) would only get us through the C code; the
> representation of this in R would require R long vectors, and Rsamtools
> does not (yet?) support that.

Sorry if I'm missing something obvious, but why would the representation
of 500 million reads (either as a GappedAlignments object or as a plain
list as returned by scanBam()) require R long vectors?
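
To make the numbers concrete, here is a rough sketch in C. It is not the
actual Rsamtools code, and the helper names are hypothetical. The figure in
the error message, 18446744065128005632, equals 2^64 - 8581545984, and
8581545984 = 4 * 2145386496. That is what one would see if a 32-bit signed
element count had wrapped to -2145386496 and was then multiplied by a 4-byte
element size after conversion to an unsigned 64-bit type. This is only one
reading of the number; the 4-byte element size in particular is an
assumption. The sketch also checks a count against 2^31 - 1, the maximum
length of an ordinary (non-long) R vector.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte count as a buggy call site might compute it.
   A negative 32-bit count converted to an unsigned 64-bit type wraps
   modulo 2^64 before the multiplication. */
uint64_t bytes_from_wrapped_count(int32_t n_elt, uint64_t elt_size)
{
    return (uint64_t) n_elt * elt_size;
}

/* Hypothetical guard: byte count for n_elt elements, or 0 if the product
   would not fit in 64 bits. */
uint64_t checked_bytes(uint64_t n_elt, uint64_t elt_size)
{
    assert(elt_size > 0);
    if (n_elt > UINT64_MAX / elt_size)
        return 0;
    return n_elt * elt_size;
}

/* Returns 1 if n elements fit in an ordinary (non-long) R vector,
   whose length is limited to 2^31 - 1. */
int fits_ordinary_r_vector(uint64_t n)
{
    return n <= (uint64_t) INT32_MAX;
}
```

With n_elt = -2145386496 and elt_size = 4, bytes_from_wrapped_count()
returns 18446744065128005632, the figure in the error message, while
fits_ordinary_r_vector(500000000) returns 1.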

Thanks,
H.


>
> Martin
>
>>
>> Michael

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


