[R] renameSeqlevels
Teresa Tavella
tavella.teresa23 at gmail.com
Tue Feb 14 15:17:00 CET 2017
Dear all,
I would like to ask whether it is possible to change the seqnames of a bam file
by passing a character vector to the function renameSeqlevels. This is
because, in order to use the function summarizeOverlaps or
countOverlaps/findOverlaps, the seqnames have to match.
From the bamfile below I have extracted the locus annotations from the
seqnames (i.e. ERCC00002, NC_001133.9, etc.) and I have created a list (same
length as the seqlevels of the bam file).
*bamfile*
GAlignments object with 6 alignments and 0 metadata columns:
      seqnames
      <Rle>
  [1] DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061
  [2] DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061
  [3] DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061
  [4] DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061
  [5] DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061
  [6] DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061
      strand       cigar    qwidth     start       end     width     njunc
       <Rle> <character> <integer> <integer> <integer> <integer> <integer>
  [1]      +     8M2D27M        35      1025      1061        37         0
  [2]      +     8M2D27M        35      1025      1061        37         0
  [3]      -         36M        36      1025      1060        36         0
  [4]      -         36M        36      1026      1061        36         0
  [5]      +         35M        35      1027      1061        35         0
  [6]      +         35M        35      1027      1061        35         0
-------
*gffile*
GRanges object with 6 ranges and 12 metadata columns:
         seqnames           ranges strand |       source     type     score
            <Rle>        <IRanges>  <Rle> |     <factor> <factor> <numeric>
  [1] NC_001133.9 [ 24837,  25070]      + | s_cerevisiae     exon      <NA>
  [2] NC_001133.9 [ 25048,  25394]      + | s_cerevisiae     exon      <NA>
  [3] NC_001133.9 [ 27155,  27786]      + | s_cerevisiae     exon      <NA>
  [4] NC_001133.9 [ 73431,  73792]      + | s_cerevisiae     exon      <NA>
  [5] NC_001133.9 [165314, 165561]      + | s_cerevisiae     exon      <NA>
  [6] NC_001133.9 [165388, 165781]      + | s_cerevisiae     exon      <NA>
          phase     gene_id  transcript_id exon_number   gene_name
      <integer> <character>    <character> <character> <character>
  [1]      <NA> XLOC_000040 TCONS_00000191           1        FLO9
  [2]      <NA> XLOC_000040 TCONS_00000192           1        FLO9
  [3]      <NA> XLOC_000041 TCONS_00000193           1        FLO9
  [4]      <NA> XLOC_000055 TCONS_00000200           1   YAL037C-A
  [5]      <NA> XLOC_000075 TCONS_00000100           1     YAR010C
  [6]      <NA> XLOC_000075 TCONS_00000219           1     YAR010C
                                         oId nearest_ref  class_code
                                 <character> <character> <character>
  [1]   {TRINITY_GG_normal}16_c1_g1_i1.mrna1        rna8           x
  [2]   {TRINITY_GG_normal}16_c0_g1_i1.mrna1        rna8           x
  [3]   {TRINITY_GG_normal}12_c0_g1_i1.mrna1        rna8           x
  [4]    {TRINITY_GG_normal}3_c3_g1_i1.mrna1       rna31           x
  [5] {TRINITY_GG_normal}3479_c0_g1_i1.mrna1       rna77           x
  [6]   {TRINITY_GG_normal}24_c0_g1_i1.mrna1       rna77           x
           tss_id
      <character>
  [1]       TSS42
  [2]       TSS43
  [3]       TSS44
  [4]       TSS71
  [5]      TSS118
  [6]      TSS118
-------
Is it possible to replace the seqlevels names with the list?
I have tried:
bamfile1 <- renameSeqlevels(seqlevels(bamfile), listx)
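
For comparison, here is a minimal sketch of how such a renaming is usually written. It rests on several assumptions: that every seqlevel carries its locus name between "_loc:" and the first "|", that renameSeqlevels receives the object itself (rather than seqlevels(bamfile)) as its first argument, and that the new names are supplied as a character vector whose names are the old seqlevels. The helper locus_from_seqlevel is hypothetical and only serves the illustration.

```r
# Hypothetical helper: pull the locus name out of a long seqlevel.
# Assumes the pattern "..._loc:<locus>|..." seen in the output above;
# strings that do not match are returned unchanged.
locus_from_seqlevel <- function(x) {
  sub("^.*_loc:([^|]+)\\|.*$", "\\1", x)
}

# Seqlevel copied from the GAlignments output in this message
old_levels <- "DQ459430_gene=ERCC00002_loc:ERCC00002|1-1061|+_exons:1-1061_segs:1-1061"

# Named character vector: names are the old seqlevels, values the new ones
new_levels <- locus_from_seqlevel(old_levels)
names(new_levels) <- old_levels
print(unname(new_levels))

# Applying the mapping needs Bioconductor and the GAlignments object
# (called bamfile in this message), so it only runs when both are present.
if (exists("bamfile") && requireNamespace("GenomeInfoDb", quietly = TRUE)) {
  map <- locus_from_seqlevel(GenomeInfoDb::seqlevels(bamfile))
  names(map) <- GenomeInfoDb::seqlevels(bamfile)
  # First argument is the object, second the character vector of new names
  bamfile1 <- GenomeInfoDb::renameSeqlevels(bamfile, map)
}
```

If two old seqlevels map to the same locus name the renaming may fail or merge levels, so it is worth checking anyDuplicated(map) first.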
Thank you for any advice,
Kind regards,
Teresa