[BioC] Extract junction reads from bam

Tue Aug 19 00:35:26 CEST 2014

Dear Helen,

If you just want to get the number of reads mapping to that junction (not the actual reads), you can try featureCounts function in Rsubread package. It has a parameter called 'countSplitAlignmentsOnly' which allows you to count exon-spanning reads only. It is extremely fast.

Best wishes,
Wei

On Aug 19, 2014, at 6:32 AM, Helen Zhou wrote:

> Dear Sir/Madam,
> 
> I have a bam file with reads from a paired-end RNA-seq experiments, and I would like to extract all reads that span a particular intron/exon junction. For example where part of a read maps to exon ABC starting at chr 1 position 66909348. I do not know where the first part of the read maps; there are multiple possible exons being spliced to exon ABC.
> 
> I can extract all reads mapping to and spanning for example 50nt around the junction, saying:
> library(GenomicRanges)
> loc <-RangesList('1'=IRanges(start=6690298,end=6690398)) 
> parameter <- ScanBamParam(which=loc, what=extract, simpleCigar=FALSE, reverseComplement=FALSE)
> aln <- readGAlignmentsFromBam("test.bam", param=parameter)
> 
> However, this includes all the reads that span across this region because the ends of the read map outside the area indicated in 'loc'. How can I get only the reads where (part of) the read starts mapping at the exact intron/exon location at position 66909348?
> 
> Thanks you
> H Zhou
> 	[[alternative HTML version deleted]]
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

______________________________________________________________________
The information in this email is confidential and intend...{{dropped:6}}