[BioC] About ChIPpeakAnno
Zhu, Lihua (Julie)
Julie.Zhu at umassmed.edu
Sat Mar 22 17:12:36 CET 2014
Lucia,
Alternatively, you could combine the results of calling annotatePeakInBatch
twice with two different FeatureLocForDistance options, then filter the
combined result on the shortestDistance column.
NearestTSS <- annotatePeakInBatch(peakRange,
    AnnotationData = TSS.mouse.NCBIM37)
NearestGeneEnd <- annotatePeakInBatch(peakRange,
    FeatureLocForDistance = "geneEnd",
    AnnotationData = TSS.mouse.NCBIM37)
NearestGene <- rbind(as.data.frame(NearestTSS),
    as.data.frame(NearestGeneEnd))
Then filter NearestGene to keep, for each peak, the gene with the smaller
shortestDistance.
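The filtering step could look roughly like the sketch below. It uses a small made-up data frame in place of the real NearestGene so that it runs on its own, and it assumes the combined data frame has columns named peak and shortestDistance; please check names(NearestGene) on your own result before relying on those names. The gene names and distances are invented for illustration.

```r
# Toy stand-in for the combined NearestGene data frame (values are made up).
# Assumption: the real data frame has 'peak' and 'shortestDistance' columns.
NearestGene <- data.frame(
  peak = c("p1", "p2", "p1", "p2"),
  feature = c("geneA", "geneB", "geneC", "geneB"),
  shortestDistance = c(1200, 300, 150, 900),
  stringsAsFactors = FALSE
)

# Sort by peak, then by distance, so the closest feature comes first per peak.
ordered <- NearestGene[order(NearestGene$peak, NearestGene$shortestDistance), ]

# Keep only the first (closest) row for each peak.
closest <- ordered[!duplicated(ordered$peak), ]
print(closest)
```

Sorting first and then dropping duplicated peak names keeps exactly one row per peak, which is usually easier to check by eye than a grouped minimum.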
If needed, we could wrap this up and add an option "shortestDistance" in
FeatureLocForDistance. Please let us know. Thanks!
Best regards,
Julie
On 3/20/14 6:26 PM, "Lihua Julie Zhu" <julie.zhu at umassmed.edu> wrote:
> Lucia,
>
> Yes, by default the function returns the gene with the shortest distance from
> peak start to TSS. You could try to set output = "both", maxgap = 5000,
> PeakLocForDistance = "middle" to have the function output all genes that are
> within 5kb away from the middle of the peak. For detailed parameter setting,
> please type help(annotatePeakInBatch) in an R session.
>
> Hope this helps.
>
> Best regards,
>
> Julie
>
>
> On 3/20/14 4:11 PM, "Lucia Peixoto" <luciap at iscb.org> wrote:
>
> Hi Julie,
>
> I have run ChIPpeakAnno without any size constraints to see what happened.
> It seemed to be running fine, but when I went to look at my positive controls
> I realized that it is not annotating all the intragenic peaks as "inside"
>
> For example, I have a peak at
> chr15 89378450 89379100 (mm9), and although it falls inside a gene, it is
> annotated as upstream of the next gene downstream.
>
> Any idea what could be the problem? Is it because I am using TSS as the
> annotation file and this peak is closer to the TSS of the next gene even
> though it is still intragenic? Is there any way to keep this from happening
> and get true intragenic calls?
>
> Here is my R code:
>
>
> myPeakList <- read.table("DESonoseq_All.bed")
> peakRange <- BED2RangedData(myPeakList)
> annotatedPeak <- annotatePeakInBatch(peakRange,
>     AnnotationData = TSS.mouse.NCBIM37)
> as.data.frame(annotatedPeak)
>
> thanks
>
> Lucia
>
>
>
>
>
>
> On Fri, Mar 7, 2014 at 2:56 PM, Zhu, Lihua (Julie) <Julie.Zhu at umassmed.edu>
> wrote:
> Lucia,
>
> If you type help(annotatePeakInBatch), you will see that there is a
> parameter "output" with three options. By default, it is set to nearestStart
> which will generate nearest features without any distance constraint. If you
> set "output" to one of the other two options, then the distance cutoff can
> be set by specifying "maxgap", e.g., 5000 as 5kb. Please let me know if this
> answers your questions.
>
> Best regards,
>
> Julie
>
>
> On 3/7/14 2:18 PM, "Lucia Peixoto" <luciap at iscb.org> wrote:
>
>> Hi,
>> This is my first time using the package, so maybe this is a naive question.
>> What is the distance cutoff used to find the "nearest feature (gene, exon,
>> miRNA, etc.)"?
>> Or is there none, so that I can filter on distance after the mapping?
>> thanks
>
>
>
>