[BioC] DiffBind - error in dba.count
Gordon Brown
Gordon.Brown at cancer.org.uk
Fri Sep 14 12:28:42 CEST 2012
Hi, António,
DiffBind doesn't read SAM format, only BAM and (gzipped or uncompressed)
BED. Try converting your SAM files to BAM.
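
If you'd rather stay inside R for the conversion, Rsamtools provides asBam(), which writes a sorted and indexed BAM file from a SAM file. A minimal sketch, assuming Rsamtools is installed; the helper name sam_to_bam and the file names in the commented example are placeholders, not anything from your setup:

```r
# Hypothetical helper: convert each SAM file to a sorted, indexed BAM file.
# Assumes the Bioconductor package Rsamtools is installed.
sam_to_bam <- function(sam_files) {
  vapply(sam_files, function(sam) {
    # asBam() appends ".bam" to the destination prefix and returns the BAM path
    Rsamtools::asBam(sam,
                     destination = sub("\\.sam$", "", sam),
                     overwrite = TRUE)
  }, character(1))
}

# Example call with placeholder file names:
# bam_files <- sam_to_bam(c("wt1.sam", "wt2.sam"))
```

The bamReads (and bamControl) columns of the sample sheet would then point at the resulting .bam files.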
Cheers,
- Gord
P.S. "chr, start, end, score" isn't technically a legal BED file. You'd
need "chr, start, end, name, score" for it to be a BED file, though
DiffBind may read them anyway (I didn't write that part...).
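
If the 4-column files do turn out to be a problem, adding a name column is straightforward in base R. A rough sketch, assuming tab-separated peak files with no header line; the helper name add_bed_names, the "peak" prefix and the file names are made up for illustration:

```r
# Hypothetical helper: turn a 4-column peak table (chr, start, end, score)
# into a 5-column BED-style table (chr, start, end, name, score).
add_bed_names <- function(peaks, prefix = "peak") {
  stopifnot(ncol(peaks) == 4)
  names(peaks) <- c("chr", "start", "end", "score")
  data.frame(
    chr   = peaks$chr,
    start = peaks$start,
    end   = peaks$end,
    name  = paste0(prefix, seq_len(nrow(peaks))),  # peak1, peak2, ...
    score = peaks$score,
    stringsAsFactors = FALSE
  )
}

# Example with placeholder file names:
# peaks <- read.table("wt1_peaks.bed", sep = "\t", stringsAsFactors = FALSE)
# write.table(add_bed_names(peaks), "wt1_peaks.named.bed", sep = "\t",
#             quote = FALSE, row.names = FALSE, col.names = FALSE)
```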
On 2012-09-14 10:58, "António Miguel de Jesus Domingues"
<amjdomingues at gmail.com> wrote:
>Hi again,
>
>I am trying DiffBind and loaded my data that looks like this:
>
>H3K4m3
>4 Samples, 13203 sites in matrix (13792 total):
>  ID     Tissue Factor  Condition  Peak.caller Replicate Intervals
>1 wt1    Hela   H3K4me3 control1   raw         1         14111
>2 wt2    Hela   H3K4me3 control2   raw         2         13771
>3 treat1 Hela   H3K4me3 condition1 raw         1         14865
>4 treat2 Hela   H3K4me3 condition2 raw         2         13393
>
>But I ran into problems trying to calculate the affinity scores with
>dba.count:
>
>H3K4m3 = dba.count(H3K4m3)
>Error in cond$counts : $ operator is invalid for atomic vectors
>In addition: Warning message:
>In mclapply(arglist, fn, ..., mc.preschedule = FALSE) :
> 6 function calls resulted in an error
>
>The peaks are in bed files (chr, start, end, score) and the reads are in
>SAM format.
>
>Can anyone help me with this?
>
>Cheers.
>António
>
>> sessionInfo()
>R version 2.14.1 (2011-12-22)
>Platform: x86_64-pc-linux-gnu (64-bit)
>
>locale:
>[1] C
>
>attached base packages:
>[1] parallel stats graphics grDevices utils datasets methods
>[8] base
>
>other attached packages:
>[1] DiffBind_1.0.9 Biobase_2.14.0
>
>loaded via a namespace (and not attached):
>[1] IRanges_1.12.6     RColorBrewer_1.0-5 amap_0.8-7         edgeR_2.4.6
>[5] gdata_2.11.0       gplots_2.11.0      gtools_2.7.0       limma_3.10.3
>[9] zlibbioc_1.0.1
>>
>
>On 13 September 2012 18:06, António Miguel de Jesus Domingues <
>amjdomingues at gmail.com> wrote:
>
>> Hi all,
>>
>> I am trying to use DiffBind to compare peaks called in control vs
>> condition. I have 2 replicates for each, and I've also called peaks using
>> 2 different peak callers (to wit, MACS and QuEST). I've also prepared a
>> sample data sheet that looks like this:
>> SampleID Tissue Factor Condition Replicate Peak.caller bamReads bamControl Peaks
>> control  Hela   TF     wt        1         MACS        path     path       path
>> control  Hela   TF     wt        1         QuEST       path     path       path
>> control2 Hela   TF     wt        2         MACS        path     path       path
>> control2 Hela   TF     wt        2         QuEST       path     path       path
>> (and the same for the conditions)
>>
>> My plan was to load all the data and then use DiffBind to select a set of
>> common peaks for the peak callers before proceeding with the analysis.
>> However, when I load the data (data = dba(sampleSheet="samplesheet.csv")),
>> the peaks for each caller are not recognized as a different variable. How
>> can I do that, and is this silly?
>>
>> I could also derive a set of common peaks independently, but it would be
>> neat to do it all with the same package. That seems to be possible, but I
>> could not find how to do it in the documentation.
>>
>> Thanks,
>> António
>>
>>
>> --
>> --
>> António Miguel de Jesus Domingues, PhD
>> Neugebauer group
>> Max Planck Institute of Molecular Cell Biology and Genetics, Dresden
>> Pfotenhauerstrasse 108
>> 01307 Dresden
>> Germany
>>
>> e-mail: domingue at mpi-cbg.de
>> tel. +49 351 210 2481
>> The Unbearable Lightness of Molecular Biology
>>
>
>
>