[BioC] findOverlaps method in GenomicRanges not supporting type="equal" for GRangesList, GRangesList?
Hervé Pagès
hpages at fhcrc.org
Thu Nov 21 21:02:38 CET 2013
Hi Michael, Nico,
Right now match/== methods for List objects behave inconsistently.
For example, even for conceptually close objects like IntegerList
and XIntegerViews, we have:
x <- IntegerList(a=1:5, b=2:-3, c=1:3)
v <- successiveViews(unlist(x), elementLengths(x))
> x == rev(x)
LogicalList of length 3
[["a"]] TRUE TRUE TRUE FALSE FALSE
[["b"]] TRUE TRUE TRUE TRUE TRUE TRUE
[["c"]] TRUE TRUE TRUE FALSE FALSE
> v == rev(v)
[1] FALSE TRUE FALSE
> match(x, rev(x))
IntegerList of length 3
[["a"]] 1 2 3 <NA> <NA>
[["b"]] 1 2 3 4 5 6
[["c"]] 1 2 3
> match(v, rev(v))
Error in base::match(x, table, nomatch = nomatch, incomparables =
incomparables, :
'match' requires vector arguments
This is not a good situation, and some work remains to be done to clean
up the match/== methods in IRanges/GenomicRanges. In the meantime, I
think we should hold off on adding new methods for List objects until
there is a clear consensus on how they should behave.
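
To make the two behaviors concrete, here is a minimal sketch in base R. It assumes plain lists as stand-ins for IntegerList and XIntegerViews, so it only mimics the two semantics and does not call the IRanges methods themselves. The names atomic_equal and elementwise_match are made up for this illustration.

```r
# Plain-list stand-in for the IntegerList above (an assumption for
# illustration; no Bioconductor package is loaded).
x <- list(a = 1:5, b = 2:-3, c = 1:3)
y <- rev(x)

# Semantics 1: one answer per top-level element, each element being
# compared as a whole (what the Views example above reports).
atomic_equal <- unname(mapply(identical, x, y))

# Semantics 2: one answer per inner value, grouped by top-level element
# (what the IntegerList example above reports).
elementwise_match <- Map(match, x, y)

print(atomic_equal)
print(elementwise_match)
```

Whichever semantics is chosen, the point is that a single generic should not return a flat vector for one List-like class and a grouped list for another.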
As for Nico's request, I agree that the best way to go would be to just
make findOverlaps(type="equal") work. There are some subtle semantic
differences between a *match* (as reported by match or ==), and equality
from a range overlap point of view. The former can report equality
for ranges on a circular sequence that are not considered equal for
the latter. Another difference is how zero-width ranges are handled.
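
The zero-width case can be sketched in base R as follows. This is only an illustration under a stated assumption, namely that an overlap hit requires at least one shared position. The helpers match_equal and overlap_equal are hypothetical and are not part of IRanges or GenomicRanges, whose actual behavior may differ between versions.

```r
# Ranges as plain data frames with integer 'start' and 'width' columns
# (a stand-in representation, not an IRanges object).
ranges_a <- data.frame(start = c(1L, 5L), width = c(3L, 0L))
ranges_b <- data.frame(start = c(1L, 5L), width = c(3L, 0L))

# Match-style equality: compare the (start, width) pairs literally.
match_equal <- function(a, b) {
  a$start == b$start & a$width == b$width
}

# Overlap-style equality, ASSUMING a hit needs at least one shared
# position: a zero-width range covers no position, so it is never a hit.
overlap_equal <- function(a, b) {
  match_equal(a, b) & a$width > 0L & b$width > 0L
}

res_match <- match_equal(ranges_a, ranges_b)
res_overlap <- overlap_equal(ranges_a, ranges_b)
print(res_match)
print(res_overlap)
```

Under that assumption the second pair (start 5, width 0) is a match but not an "equal" overlap, which is the kind of discrepancy to keep in mind.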
Thanks,
H.
On 11/21/2013 10:43 AM, Michael Lawrence wrote:
> So I've checked a match,GRangesList,GRangesList method into devel. This
> allows findMatches() to return what you want. There is a question, though,
> before this is approved: does it make sense for match() to act like
> findOverlaps and consider each GRanges atomically (one returned index per
> GRanges), or should match behave as it does for other Lists and return an
> IntegerList, with a value per range, grouped by the top-level elements? If
> we decide on the latter, then the method I wrote needs to be removed and the
> implementation moved to the "equal" mode in findOverlaps. Either way,
> findOverlaps(type="equal") should be made to work.
>
> Michael
>
>
> On Thu, Nov 21, 2013 at 8:13 AM, Nicolas Delhomme
> <nicolas.delhomme at umu.se> wrote:
>
>> Thanks!
>> ---------------------------------------------------------------
>> Nicolas Delhomme
>>
>> Nathaniel Street Lab
>> Department of Plant Physiology
>> Umeå Plant Science Center
>>
>> Tel: +46 90 786 7989
>> Email: nicolas.delhomme at plantphys.umu.se
>> SLU - Umeå universitet
>> Umeå S-901 87 Sweden
>> ---------------------------------------------------------------
>>
>> On 21 Nov 2013, at 17:06, Michael Lawrence <lawrence.michael at gene.com>
>> wrote:
>>
>>> I will work on this today.
>>>
>>> Michael
>>>
>>>
>>> On Thu, Nov 21, 2013 at 4:43 AM, Nicolas Delhomme <nicolas.delhomme at umu.se> wrote:
>>> Hej Bioc!
>>>
>>> When I try to find “equal” ranges from two GRangesList objects, I get the following error:
>>>
>>>> findOverlaps(query=grng.def,subject=grng.mod,type="equal")
>>> Error in match.arg(type) :
>>> 'arg' should be one of “any”, “start”, “end”, “within”
>>>
>>> Isn’t type="equal" supported for the GRangesList, GRangesList signature?
>>>
>>> Cheers,
>>>
>>> Nico
>>>
>>> sessionInfo()
>>> R version 3.0.2 (2013-09-25)
>>> Platform: x86_64-apple-darwin13.0.0 (64-bit)
>>>
>>> locale:
>>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>>
>>> attached base packages:
>>> [1] parallel stats graphics grDevices utils datasets methods
>> base
>>>
>>> other attached packages:
>>> [1] easyRNASeq_1.8.2 ShortRead_1.20.0 Rsamtools_1.14.1
>> GenomicRanges_1.14.3 DESeq_1.14.0 lattice_0.20-24
>> locfit_1.5-9.1
>>> [8] Biostrings_2.30.1 XVector_0.2.0 IRanges_1.20.5
>> edgeR_3.4.0 limma_3.18.3 biomaRt_2.18.0
>> Biobase_2.22.0
>>> [15] genomeIntervals_1.18.0 BiocGenerics_0.8.0 intervals_0.14.0
>>>
>>> loaded via a namespace (and not attached):
>>> [1] annotate_1.40.0 AnnotationDbi_1.24.0 bitops_1.0-6
>> DBI_0.2-7 genefilter_1.44.0 geneplotter_1.40.0 grid_3.0.2
>> hwriter_1.3
>>> [9] latticeExtra_0.6-26 LSD_2.5 RColorBrewer_1.0-5
>> RCurl_1.95-4.1 RSQLite_0.11.4 splines_3.0.2 stats4_3.0.2
>> survival_2.37-4
>>> [17] tools_3.0.2 XML_3.98-1.1 xtable_1.7-1
>> zlibbioc_1.8.0
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>
>>
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319