[BioC] Help with sliding window analysis on GRanges object
Martin Morgan
mtmorgan at fhcrc.org
Sun Mar 9 06:29:01 CET 2014
Hi Stefano --
On 03/08/2014 01:41 PM, Stefano Iantorno wrote:
> Hello
>
> I am trying to conduct a sliding window analysis on a GRanges object. My
> ranges are a list of 60272 single nucleotide positions representing high
> confidence SNPs stored as IRanges object. I would like to retrieve the
> list of GRanges row IDs for each 500bp window in the genome
> (overlapping windows).
>
> All the documentation I could find on sliding window functions such as
> runsum, runmean, etc are all for Rle objects.
>
> Any idea where to start from? I can't figure out a way to pick windows
> in the IRanges object across intervals, since each interval is
> represented by a start and end position (same genomic position since
> it's a single nucleotide long).
>
It's hard for me to figure out what you want to do. I guess you've got some SNPs

    snps = GRanges("chr1", IRanges(c(1000, 1200, 2000), width=1))

and you'd like to count the number of SNPs in a sliding window of width 500?
You can easily represent your SNPs as an Rle instead of a GRanges

    > (cvg <- coverage(snps))
    RleList of length 1
    $chr1
    integer-Rle of length 2000 with 6 runs
      Lengths: 999   1 199   1 799   1
      Values :   0   1   0   1   0   1

and then calculate a running mean (or runsum) as I guess you've found

    > (r <- runmean(cvg, 500))
    RleList of length 1
    $chr1
    numeric-Rle of length 1501 with 6 runs
      Lengths:   500   200   300   200   300     1
      Values :     0 0.002 0.004 0.002     0 0.002

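If raw counts per window are more useful than means, runsum gives them directly. The following is only a minimal sketch on the same toy SNPs; the variable names cnt, v and busiest are made up for illustration, and it assumes the default end rule, under which element i of the result describes the window covering positions i to i + 499.

```r
library(GenomicRanges)  # also attaches IRanges and S4Vectors

# Same toy SNPs as in the example above
snps <- GRanges("chr1", IRanges(c(1000, 1200, 2000), width=1))
cvg <- coverage(snps)

# Running sum over 500 bp: element i is the number of SNPs in [i, i + 499]
cnt <- runsum(cvg, 500)[["chr1"]]

# Expand the Rle to a plain vector and find the window starts
# that hold the largest number of SNPs
v <- as.numeric(cnt)
busiest <- which(v == max(v))
range(busiest)  # 701 1000
```

Windows starting anywhere from 701 to 1000 contain both the SNP at 1000 and the one at 1200, which is the same stretch where the running mean above reaches 0.004 (2 / 500).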
The toy example could be visualized as

    plot(as.numeric(r[[1]]), type="l")

or perhaps ggbio::autoplot(r).
Is that something like what you're looking for, or would you like to take this a
step further? Maybe you can construct a simple example like the one here to show
what you're trying to do.

Martin
>
> Any help will be greatly appreciated.
>
> Thanks
>
> - Stefano
>
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793