[BioC] Selecting elements in GRanges object by element metadata
Michael Muratet
mmuratet at hudsonalpha.org
Wed Jul 11 17:26:28 CEST 2012
On Jul 11, 2012, at 10:21 AM, Kasper Daniel Hansen wrote:
> What you are reporting is true for any (well, there may be exceptions
> I guess) subsetting. Try for example with a standard matrix. The
> solution is to add which(). Contrast
>
>> x = c(1,2,NA)
>> x == 2
> [1] FALSE TRUE NA
>> which(x == 2)
> [1] 2
>
> Kasper
>
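For anyone reading the archive later, Kasper's suggestion to try a standard matrix can be sketched as below. This is only a base-R illustration with made-up data (the matrix `m` and its column names are hypothetical, not from the thread); it shows that a logical index containing NA returns an all-NA row, while which() drops the NA position.

```r
# Hypothetical toy matrix with an NA in the column used for the test.
m <- cbind(id = 1:3, score = c(1, 2, NA))

# Logical index FALSE TRUE NA: the NA position comes back as a row of NAs.
by_logical <- m[m[, "score"] == 2, , drop = FALSE]

# which() returns only the positions that are TRUE, so the NA is dropped.
by_which <- m[which(m[, "score"] == 2), , drop = FALSE]

print(by_logical)
print(by_which)
```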
Thanks, I should have tried that before. This syntax works:
> tss.annot.gr[na.omit(which(elementMetadata(tss.annot.gr)$GENE>0)),"GENE"]
GRanges with 3446 ranges and 1 elementMetadata col:
      seqnames               ranges strand |        GENE
         <Rle>            <IRanges>  <Rle> |   <numeric>
  [1]     chr1 [ 4773791,  4776291]      - | 1.063973966
  [2]     chr1 [ 5007460,  5009960]      - | 1.668134677
  [3]     chr1 [16092486, 16094986]      - | 1.748685661
  [4]     chr1 [36737931, 36740431]      - | 1.465666717
  [5]     chr1 [38052053, 38054553]      - | 1.750940655
  [6]     chr1 [38054354, 38056854]      + | 1.677518675
  [7]     chr1 [39592146, 39594646]      + | 0.696900841
  [8]     chr1 [40380974, 40383474]      + | 0.777552281
  [9]     chr1 [40738056, 40740556]      + | 0.511665769
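A minimal base-R sketch of why the original na.omit() attempt returned rows that violate the condition, assuming a plain numeric vector behaves analogously to the metadata column (the vector `gene` is made-up data, not the real GENE values). Applying na.omit() to the logical vector shortens it, so the surviving TRUE/FALSE values no longer line up with the elements they were computed from. Since which() never returns NA, the na.omit() wrapped around which() in the working call above should be redundant.

```r
# Hypothetical stand-in for the GENE metadata column.
gene <- c(NA, 0, NA, 1.5, -1.2, 0.7)

keep <- gene > 0              # NA FALSE NA TRUE FALSE TRUE
shortened <- na.omit(keep)    # FALSE TRUE FALSE TRUE (length 4, not 6)

# The length-4 index is recycled over 6 elements, so positions 2, 4, 6
# are selected, including gene[2] == 0, which fails the test.
misaligned <- gene[shortened]

# which() keeps the original positions of the TRUE values.
by_which <- gene[which(gene > 0)]

# An explicit NA-safe logical mask gives the same result.
by_mask <- gene[!is.na(gene) & gene > 0]

print(misaligned)
print(by_which)
```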
Mike
> On Wed, Jul 11, 2012 at 11:09 AM, Michael Muratet
> <mmuratet at hudsonalpha.org> wrote:
>> Greetings
>>
>> I would like to select elements from a GRanges object by testing values
>> in the metadata columns. This seems to work OK:
>>
>> x.gr[which(elementMetadata(x.gr)$fdr<0.05)]
>>
>> So does this, although there's nothing in the documentation about the
>> [] operator accepting logical values:
>>
>> fosl2.th17.gr[elementMetadata(fosl2.th17.gr)$fdr<0.05]
>>
>> The problem arises when I try to select from a GRanges object where the
>> metadata columns have NAs:
>>
>>> tss.annot.gr[na.omit(elementMetadata(tss.annot.gr)$GENE>0),"GENE"]
>> GRanges with 4028 ranges and 1 elementMetadata col:
>>        seqnames                 ranges strand |         GENE
>>           <Rle>              <IRanges>  <Rle> |    <numeric>
>>    [1]     chr1 [  3659579,   3662079]      - |         <NA>
>>    [2]     chr1 [  4847394,   4849894]      + |            0
>>    [3]     chr1 [ 10025979,  10028479]      - |         <NA>
>>    [4]     chr1 [ 17085879,  17088379]      - |         <NA>
>>    [5]     chr1 [ 21067298,  21069798]      - |         <NA>
>>    [6]     chr1 [ 21949662,  21952162]      - |            0
>>    [7]     chr1 [ 23388014,  23390514]      - |         <NA>
>>    [8]     chr1 [ 23768264,  23770764]      + |         <NA>
>>    [9]     chr1 [ 23927128,  23929628]      - |         <NA>
>>    ...      ...                    ...    ... ...          ...
>> [4020]     chr2 [126607180, 126609680]      - |            0
>> [4021]     chr2 [127345106, 127347606]      - |            0
>> [4022]     chr2 [129195132, 129197632]      + | -1.223140339
>> [4023]     chr2 [129194856, 129197356]      - | -1.628782357
>> [4024]     chr2 [129360338, 129362838]      - | -1.475535653
>> [4025]     chr2 [129837609, 129840109]      + |            0
>> [4026]     chr2 [129948520, 129951020]      + |            0
>> [4027]     chr2 [140213446, 140215946]      - |            0
>> [4028]     chr2 [148267271, 148269771]      - | -1.564551101
>>
>> The values returned violate the condition. It won't work at all without
>> na.omit.
>>
>> I can coerce the GRanges object to a data.frame, do the selection and
>> create a new GRanges object, but I'm hoping there is a way to do it
>> directly.
>>
>> Am I using the syntax correctly? Is there something peculiar about a
>> DataFrame vs a data.frame that's getting in the way?
>>
>> Thanks
>>
>> Mike
>>
>>
>>
>> Michael Muratet, Ph.D.
>> Senior Scientist
>> HudsonAlpha Institute for Biotechnology
>> mmuratet at hudsonalpha.org
>> (256) 327-0473 (p)
>> (256) 327-0966 (f)
>>
>> Room 4005
>> 601 Genome Way
>> Huntsville, Alabama 35806
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor