[BioC] Using GenomicRanges with table data
Tom Oates
toates19 at gmail.com
Fri Jan 11 12:54:25 CET 2013
Hi,
I am trying to use GenomicRanges as part of an analysis of sequencing data.
I have a number of files which I wish to use to make GRanges objects.
For example:
chr1 579578 579804 CpG_12
chr1 630418 630623 CpG_11
chr1 804552 804763 CpG_9
chr1 1307051 1307362 CpG_16
chr1 1323599 1323808 CpG_9
chr1 1350549 1350758 CpG_12
chr1 1403287 1403637 CpG_20
chr1 1418906 1419488 CpG_28
This file is sorted such that chr1 is followed by chr2, chr3, chr4, chr5,
etc., up to chr20 (as opposed to chr1, chr10, chr11 ... chr19, chr2, chr3, etc.).
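The cpgi object used below is not defined in this message. Its column names V1 to V4 are the defaults that read.table assigns when a file is read without a header, so it was presumably created along these lines. This is only a minimal sketch under that assumption: the rows are inlined so the snippet runs on its own, the variable name rows is hypothetical, and stringsAsFactors = FALSE is an assumed setting, not something stated in the post.

```r
# First rows of the example file, inlined so the sketch is self-contained.
# With the real data, a file path would be passed instead of `text`.
rows <- c("chr1 579578 579804 CpG_12",
          "chr1 630418 630623 CpG_11",
          "chr1 804552 804763 CpG_9",
          "chr1 1307051 1307362 CpG_16",
          "chr1 1323599 1323808 CpG_9",
          "chr1 1350549 1350758 CpG_12",
          "chr1 1403287 1403637 CpG_20",
          "chr1 1418906 1419488 CpG_28")

# header = FALSE yields the default column names V1..V4.
# stringsAsFactors = FALSE (an assumption) keeps V1 and V4 as character vectors.
cpgi <- read.table(text = rows, header = FALSE, stringsAsFactors = FALSE)

# Inspect the column types: V1 and V4 are character, V2 and V3 are integer.
str(cpgi)
```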
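I use the following to make the GRanges object:
library(GenomicRanges)
# cpgi is the data frame holding the file shown above (columns V1..V4)
cpgi_gr <- GRanges(seqnames = Rle(cpgi$V1),
                   ranges = IRanges(start = cpgi$V2, end = cpgi$V3),
                   UCSC_AL_ID = cpgi$V4)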
But when I then examine
seqnames(cpgi_gr)
I get
factor-Rle of length 89611 with 21 runs
  Lengths:  8952  5602  6133  4973  5840  4260 ...  3132  3607  2793  3175  3842  1419
  Values :  chr1  chr2  chr3  chr4  chr5  chr6 ... chr16 chr17 chr18 chr19 chr20  chrX
Levels(21): chr1 chr10 chr11 chr12 chr13 chr14 chr15 ... chr5 chr6 chr7 chr8 chr9 chrX
So the order of the Values and the order of the Levels do not match. I
want to give the GRanges object seqlengths (the chromosome lengths of
the genome) so that I can then run operations such as flank on the
data, so it is crucial that the values and lengths match. I imagine the
problem comes from my not understanding IRanges or Rle well enough, but
I have read the help pages for both and can't work out how to
construct the GRanges object so that the chromosome values match.
Thanks