[BioC] DEXSeq estimating dispersions and multiple conditions
Natasha Sahgal
nsahgal at well.ox.ac.uk
Mon Jan 28 17:31:11 CET 2013
Dear List,
I am using DEXSeq to test for differential exon usage on a dataset with 3 samples (2 belong to a treatment group and 1 to a control group) and have 2 main questions.
1) Unbalanced design where one out of two groups has no replicates
Name condition
a1 control
b1 treatment
b2 treatment
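As a quick sanity check on the design above, the number of samples per condition can be tabulated in base R. This is only a minimal sketch: the object names `design`, `reps` and `unreplicated` are illustrative and are not part of DEXSeq.

```r
# Sample table copied from the design above; `design` is an illustrative name.
design <- data.frame(
  Name      = c("a1", "b1", "b2"),
  condition = factor(c("control", "treatment", "treatment")),
  stringsAsFactors = FALSE
)

# Number of samples per condition.
reps <- table(design$condition)

# Conditions represented by a single sample, i.e. without replicates.
unreplicated <- names(reps)[as.vector(reps) < 2]

print(reps)
print(unreplicated)
```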
Now, I have read in a couple of other threads that there is no point in running DEXSeq when there are no replicates (unlike DESeq, which I have already run on these samples). However, the researcher still wants me to try it out.
Thus, I was wondering: with an unbalanced design, is it still possible to run DEXSeq, and would estimateDispersions still work? When I run this function I don't get any errors, but fitDispersionFunction then fails.
########
> ecs <- estimateDispersions(ecs, nCores=8)
Estimating Cox-Reid exon dispersion estimates using 8 cores. (Progress report: one dot per 100 genes)
................................................................................................................................................................>
>
>
> ecs <- fitDispersionFunction(ecs)
Error in fitDispersionFunction(ecs) :
no CR dispersion estimations found, please first call estimateDispersions function
>
> head(fData(ecs))
                              geneID exonID testable dispBeforeSharing
ENSG00000000003:E001 ENSG00000000003   E001     TRUE                NA
ENSG00000000003:E002 ENSG00000000003   E002     TRUE                NA
ENSG00000000003:E003 ENSG00000000003   E003     TRUE                NA
ENSG00000000003:E004 ENSG00000000003   E004     TRUE                NA
ENSG00000000003:E005 ENSG00000000003   E005     TRUE                NA
ENSG00000000003:E006 ENSG00000000003   E006     TRUE                NA
                     dispFitted dispersion pvalue padjust chr    start      end
ENSG00000000003:E001         NA         NA     NA      NA   X 99883667 99884983
ENSG00000000003:E002         NA         NA     NA      NA   X 99885756 99885863
ENSG00000000003:E003         NA         NA     NA      NA   X 99887482 99887537
ENSG00000000003:E004         NA         NA     NA      NA   X 99887538 99887565
ENSG00000000003:E005         NA         NA     NA      NA   X 99888402 99888438
ENSG00000000003:E006         NA         NA     NA      NA   X 99888439 99888536
                     strand                                     transcripts
ENSG00000000003:E001      -                                 ENST00000373020
ENSG00000000003:E002      -                                 ENST00000373020
ENSG00000000003:E003      -                                 ENST00000373020
ENSG00000000003:E004      -                 ENST00000373020;ENST00000496771
ENSG00000000003:E005      -                 ENST00000373020;ENST00000496771
ENSG00000000003:E006      - ENST00000494424;ENST00000373020;ENST00000496771
########
Also, I read in another thread that Simon suggested setting fData(ecs)$dispersion <- .1 if a 1 vs 1 comparison is unavoidable, although he does not fully recommend it. Is this what I need to do?
2) How does one work with multiple groups (or conditions)?
The vignette does not cover this, and I cannot see any relevant argument in the testForDEU function, unlike DESeq (or edgeR), where one can specify which groups to compare. In DESeq, for example, if there are 3 groups (Control, Treatment and Mutant):
res1 <- nbinomTest(cds, "Control", "Treatment")
and/or
res2 <- nbinomTest(cds, "Control", "Mutant")
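One workaround I can think of, in case DEXSeq has no equivalent way to pick the contrast, would be to subset the sample table to two conditions at a time before building the count object. The following is only a base R sketch of that subsetting step: the sample table, the sample names and the helper `pairwise_design` are hypothetical, and nothing here is a DEXSeq function.

```r
# Hypothetical sample table with three groups, as in the DESeq example above.
samples <- data.frame(
  Name      = c("c1", "c2", "t1", "t2", "m1", "m2"),
  condition = factor(c("Control", "Control", "Treatment",
                       "Treatment", "Mutant", "Mutant")),
  stringsAsFactors = FALSE
)

# Keep only the samples from two conditions and rebuild the factor,
# so that downstream code sees exactly two levels (condA first).
pairwise_design <- function(samples, condA, condB) {
  stopifnot(all(c(condA, condB) %in% levels(samples$condition)))
  keep <- samples$condition %in% c(condA, condB)
  sub <- samples[keep, , drop = FALSE]
  sub$condition <- factor(sub$condition, levels = c(condA, condB))
  rownames(sub) <- NULL
  sub
}

d1 <- pairwise_design(samples, "Control", "Treatment")
d2 <- pairwise_design(samples, "Control", "Mutant")
print(d1)
print(d2)
```

Each reduced table would then be used to build a separate count object containing only those samples, so that each test sees a two-level condition.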
Many Thanks,
Natasha
sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] WriteXLS_2.3.0 gdata_2.12.0 DEXSeq_1.4.0 Biobase_2.18.0
[5] BiocGenerics_0.4.0
loaded via a namespace (and not attached):
[1] biomaRt_2.14.0 gtools_2.7.0 hwriter_1.3 plyr_1.7.1 RCurl_1.95-3
[6] statmod_1.4.16 stringr_0.6.1 tools_2.15.2 XML_3.95-0.1