[BioC] easyRNASeq package: error when summarizing per gene
Johanna Schott [guest]
guest at bioconductor.org
Sun Sep 16 14:21:52 CEST 2012
Dear list,
While analyzing RNA-Seq data with the easyRNASeq package, I get an error raised in .Call2("Rle_getStartEndRunAndOffset", x, start, end, PACKAGE = "IRanges"):
'x' values larger than vector length 'sum(width)'
Here is my code:
> library("easyRNASeq")
> library(BSgenome.Mmusculus.UCSC.mm9)
> annot <- load("gAnnot.rda")
> count_table <- easyRNASeq(getwd(), filenames = "Sample1.bam",
+ readLength = 52L,
+ organism = "Mmusculus",
+ chr.sizes = seqlengths(Mmusculus),
+ format = "bam",
+ annotationMethod = "env",
+ annotationObject = exon_range,
+ count = "genes",
+ summarization = "geneModels"
+ )
Here is the error message:
Error in unlist(aggregate(readCoverage(obj)[names(geneModel(obj))[gm.sel]], :
error in evaluating the argument 'x' in selecting a method
for function 'unlist': Error in .Call2("Rle_getStartEndRunAndOffset", x, start, end, PACKAGE = "IRanges") :
'x' values larger than vector length 'sum(width)'
This only happens when I am trying to summarize reads according to genes. For count = "transcripts", everything works fine. Does the problem come from my annotation file?
For the moment, I am only interested in annotation from the UCSC RefSeq track (table refGene). I had to build the annotation object myself, because I could not find any other way to get an annotation object from UCSC that contains gene names and not only transcript names. I downloaded all exons of the refGene table as a custom track and took the gene name for each transcript from the name2 column of the refGene table. My annotation object looks like this:
> exon_range
RangedData with 285524 rows and 4 value columns across 32 spaces
       space       ranges                 | strand      transcript   gene          exon
       <factor>    <IRanges>              | <character> <character>  <character>   <character>
1      chr1        [176160756, 176160919] | +           NM_011465    Spna1         NM_011465_exon_40_0_chr1_176160757_f
2      chr1        [164626408, 164626494] | -           NM_001081290 Prrc2c        NM_001081290_exon_15_0_chr1_164626409_r
3      chr1        [ 16121374, 16122631]  | +           NM_133832    Rdh10         NM_133832_exon_5_0_chr1_16121375_f
4      chr1        [ 21495764, 21495940]  | -           NM_001160139 Kcnq5         NM_001160139_exon_11_0_chr1_21495765_r
5      chr1        [ 21495764, 21495940]  | -           NM_023872    Kcnq5         NM_023872_exon_10_0_chr1_21495765_r
6      chr1        [ 23855102, 23855410]  | -           NM_028534    Smap1         NM_028534_exon_1_0_chr1_23855103_r
7      chr1        [ 26738645, 26742756]  | -           NM_001033764 4931408C20Rik NM_001033764_exon_0_0_chr1_26738646_r
8      chr1        [ 36568720, 36569998]  | +           NM_001039551 Cnnm3         NM_001039551_exon_0_0_chr1_36568721_f
9      chr1        [ 36568720, 36569998]  | +           NM_053186    Cnnm3         NM_053186_exon_0_0_chr1_36568721_f
...    ...         ...                  ... ...         ...          ...           ...
285516 chrY_random [52089317, 52089373]   | -           NM_001037748 LOC380994     NM_001037748_exon_7_0_chrY_random_52089318_r
285517 chrY_random [52515005, 52515028]   | +           NM_001025241 LOC434960     NM_001025241_exon_0_0_chrY_random_52515006_f
285518 chrY_random [52516256, 52517353]   | +           NM_001025241 LOC434960     NM_001025241_exon_1_0_chrY_random_52516257_f
285519 chrY_random [52590932, 52591790]   | +           NM_001160141 LOC100041223  NM_001160141_exon_0_0_chrY_random_52590933_f
285520 chrY_random [52881631, 52882623]   | +           NM_001160137 LOC100039614  NM_001160137_exon_0_0_chrY_random_52881632_f
285521 chrY_random [53819454, 53820453]   | +           NM_001160135 LOC100039574  NM_001160135_exon_0_0_chrY_random_53819455_f
285522 chrY_random [54420148, 54420272]   | +           NM_001017394 LOC100039753  NM_001017394_exon_0_0_chrY_random_54420149_f
285523 chrY_random [54421397, 54423069]   | +           NM_001017394 LOC100039753  NM_001017394_exon_1_0_chrY_random_54421398_f
285524 chrY_random [58501954, 58502946]   | +           NM_001160137 LOC100039614  NM_001160137_exon_0_0_chrY_random_58501955_f
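The gene-name lookup described above can be sketched roughly as follows in base R. This is only an illustration on three of the rows shown above: the data frame names `exons` and `refGene` are placeholders, and the sketch assumes the transcript ID is the part of the exon name before "_exon_". Converting the result into a RangedData is not shown.

```r
# Toy stand-in for the exported exon table (values copied from rows shown above).
exons <- data.frame(
  space  = c("chr1", "chr1", "chr1"),
  start  = c(176160756L, 21495764L, 21495764L),
  end    = c(176160919L, 21495940L, 21495940L),
  strand = c("+", "-", "-"),
  exon   = c("NM_011465_exon_40_0_chr1_176160757_f",
             "NM_001160139_exon_11_0_chr1_21495765_r",
             "NM_023872_exon_10_0_chr1_21495765_r"),
  stringsAsFactors = FALSE
)

# Toy stand-in for the refGene table: 'name' is the transcript, 'name2' the gene.
refGene <- data.frame(
  name  = c("NM_011465", "NM_001160139", "NM_023872"),
  name2 = c("Spna1", "Kcnq5", "Kcnq5"),
  stringsAsFactors = FALSE
)

# Assumption: the transcript ID is everything before "_exon_" in the exon name.
exons$transcript <- sub("_exon_.*$", "", exons$exon)

# Look up the gene name (name2) for each transcript.
exons$gene <- refGene$name2[match(exons$transcript, refGene$name)]

print(exons[, c("space", "start", "end", "strand", "transcript", "gene")])
```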
Does someone know what this error means, and perhaps what I would have to change in my annotation object to avoid it?
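Two sanity checks that could be run on an annotation table of this shape are sketched below in base R. Whether either condition is related to the error is purely a guess. The table `annot`, the gene names, and the chromosome sizes in `chr_sizes` are made-up toy values; in the real session the sizes would come from seqlengths(Mmusculus).

```r
# Toy annotation table (made-up values, same column layout as exon_range).
annot <- data.frame(
  space = c("chr1", "chr1", "chr2", "chr2"),
  start = c(100L, 500L, 200L, 900L),
  end   = c(200L, 600L, 300L, 1200L),
  gene  = c("geneA", "geneA", "geneA", "geneB"),
  stringsAsFactors = FALSE
)

# Toy chromosome sizes (made-up; stand-in for seqlengths(Mmusculus)).
chr_sizes <- c(chr1 = 1000L, chr2 = 1000L)

# Check 1: exons whose end lies beyond the end of their chromosome.
beyond <- unname(annot$end > chr_sizes[annot$space])
print(annot[beyond, ])

# Check 2: gene names that occur on more than one chromosome.
n_chr <- tapply(annot$space, annot$gene, function(s) length(unique(s)))
multi <- names(n_chr)[n_chr > 1]
print(multi)
```

In this toy table, the last exon runs past the end of chr2, and geneA appears on both chr1 and chr2.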
Thank you very much in advance,
Johanna
-- output of sessionInfo():
R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252 LC_NUMERIC=C LC_TIME=German_Germany.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] BSgenome.Mmusculus.UCSC.mm9_1.3.17 easyRNASeq_1.2.5 ShortRead_1.14.4 latticeExtra_0.6-24
[5] RColorBrewer_1.0-5 lattice_0.20-6 Rsamtools_1.8.6 DESeq_1.8.3
[9] locfit_1.5-8 BSgenome_1.24.0 GenomicRanges_1.8.13 Biostrings_2.24.1
[13] IRanges_1.14.4 edgeR_2.6.12 limma_3.12.3 biomaRt_2.12.0
[17] Biobase_2.16.0 genomeIntervals_1.12.0 BiocGenerics_0.2.0 intervals_0.13.3
loaded via a namespace (and not attached):
[1] annotate_1.34.1 AnnotationDbi_1.18.3 bitops_1.0-4.1 DBI_0.2-5 genefilter_1.38.0 geneplotter_1.34.0 grid_2.15.1
[8] hwriter_1.3 RCurl_1.91-1.1 RSQLite_0.11.1 splines_2.15.1 stats4_2.15.1 survival_2.36-14 XML_3.9-4.1
[15] xtable_1.7-0 zlibbioc_1.2.0
--
Sent via the guest posting facility at bioconductor.org.