[BioC] GAGE/Pathview RNA-Seq Workflows: reference genome issue
Luo Weijun
luo_weijun at yahoo.com
Thu Mar 13 03:53:36 CET 2014
Ashesh,
I don’t see this problem when I run the same code on the demo data. Judging from the error message, your problem comes from a discrepancy in the chrM seqlengths: the gene annotation and your BAM files disagree on the length of chrM. That’s why users need to align against the same version of the reference genome as the gene annotation (TxDb.Hsapiens.UCSC.hg19.knownGene). Did you use the hg19 index built for bowtie2? I downloaded it from: ftp://ftp.ccb.jhu.edu/pub/data/bowtie2_indexes/hg19.zip
Try that hg19 reference genome; re-aligning against it should fix the problem. Otherwise, follow the demo code with the demo data and see if you can work it out from there.
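To see which sequences disagree before running summarizeOverlaps, something like the following rough sketch may help. It uses only base R, and find_seqlength_conflicts() is a hypothetical helper written for this illustration, not a Bioconductor function. The chrM lengths are the ones from the error message in the original post, assuming 'x' is the annotation and 'y' the BAM file; the chr1 entry is added only to show a sequence that matches.

```r
# Minimal sketch: report sequences whose lengths differ between two sources.
# find_seqlength_conflicts() is a hypothetical helper, not part of any package.
find_seqlength_conflicts <- function(annotation_len, bam_len) {
  # Only sequences present in both sources can conflict
  shared <- intersect(names(annotation_len), names(bam_len))
  a <- annotation_len[shared]
  b <- bam_len[shared]
  # Unknown (NA) lengths are treated as compatible with anything
  differs <- !is.na(a) & !is.na(b) & a != b
  data.frame(seqname = shared[differs],
             annotation = unname(a[differs]),
             bam = unname(b[differs]),
             stringsAsFactors = FALSE)
}

# Illustrative input: chrM values come from the error message
# (assumed: 'x' = annotation, 'y' = BAM); chr1 is included only as a match.
annotation_len <- c(chr1 = 249250621L, chrM = 16571L)
bam_len        <- c(chr1 = 249250621L, chrM = 12069L)

conflicts <- find_seqlength_conflicts(annotation_len, bam_len)
print(conflicts)
```

With the real objects, the two inputs would presumably be seqlengths(exByGn) and seqlengths(BamFile(fls[1])), assuming the installed Rsamtools version provides seqinfo for BamFile objects. Any row in the result points to a sequence where the aligner's reference and the annotation were built from different genome versions.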
Weijun
--------------------------------------------
On Wed, 3/12/14, Ashesh wrote:
Dear Dr. Luo,
I recently performed some RNA-seq experiments and want to use
the Pathview software to look at changes in biological
pathways. While the instructions on how to use the
software are easy to follow, I have been unable to solve the
following initial error:
> exByGn <- exonsBy(TxDb.Hsapiens.UCSC.hg19.knownGene, "gene")
> library(Rsamtools)
Loading required package: Biostrings
> fls <- list.files("Cancer_test/", pattern="bam$", full.names=T)
> bamfls <- BamFileList(fls)
> flag <- scanBamFlag(isNotPrimaryRead=FALSE, isProperPair=NA)
> param <- ScanBamParam(flag=flag)
> gnCnt <- summarizeOverlaps(exByGn, bamfls, mode="Union",
+     ignore.strand=TRUE, single.end=TRUE, param=param)
Error in mergeNamedAtomicVectors(seqlengths(x), seqlengths(y), what = c("sequence",  :
  sequence chrM has incompatible seqlengths:
  - in 'x': 16571
  - in 'y': 12069
It seems that the bam files created by tophat (I
also used hg19) and the TxDb.Hsapiens.UCSC.hg19.knownGene
have different seqlengths for chrM. Since I am a beginner, I
have not been able to determine how best to resolve this
error. Should I change the bam file header? Is there a
way to modify the TxDb.Hsapiens.UCSC.hg19.knownGene file?
Is there a workaround?
Any help is deeply appreciated.
Sincerely,
Ashesh