[BioC] DEXSeq estimating dispersions and multiple conditions

Natasha Sahgal nsahgal at well.ox.ac.uk
Mon Jan 28 17:31:11 CET 2013


Dear List,

I am using DEXSeq to test for differential exon usage on a dataset with 3 samples (2 in a treatment group and 1 in a control group), and I have two main questions.

1) Unbalanced design where one out of two groups has no replicates

Name	condition
a1	control
b1	treatment
b2	treatment
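
To spell out the imbalance, here is a minimal base-R sketch that rebuilds the sample table above and counts the samples per condition. The object names (design, replicates, unreplicated) are only illustrative, and no DEXSeq function is involved:

```r
# Sample table copied from the post above.
design <- data.frame(
  Name      = c("a1", "b1", "b2"),
  condition = c("control", "treatment", "treatment"),
  stringsAsFactors = FALSE
)

# Number of samples per condition.
replicates <- table(design$condition)
print(replicates)

# Conditions represented by a single sample, i.e. without replicates.
unreplicated <- names(replicates)[replicates < 2]
print(unreplicated)
```

Only the control group ends up in unreplicated, which is the situation I am asking about.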

I have read in a couple of other threads that there is no point in running DEXSeq without replicates (unlike DESeq, which I have already run on these samples). However, the researcher still wants me to try it.

So I was wondering: with an unbalanced design like this, is it still possible to run DEXSeq, and would estimateDispersions still work? estimateDispersions runs without errors, but the subsequent call to fitDispersionFunction fails, and all the dispersion columns in fData(ecs) are NA:
########
> ecs <- estimateDispersions(ecs, nCores=8)
Estimating Cox-Reid exon dispersion estimates using 8 cores. (Progress report: one dot per 100 genes)
................................................................................................................................................................> 
> 
> 
> ecs <- fitDispersionFunction(ecs)
Error in fitDispersionFunction(ecs) : 
  no CR dispersion estimations found, please first call estimateDispersions function
> 
> head(fData(ecs))
                              geneID exonID testable dispBeforeSharing
ENSG00000000003:E001 ENSG00000000003   E001     TRUE                NA
ENSG00000000003:E002 ENSG00000000003   E002     TRUE                NA
ENSG00000000003:E003 ENSG00000000003   E003     TRUE                NA
ENSG00000000003:E004 ENSG00000000003   E004     TRUE                NA
ENSG00000000003:E005 ENSG00000000003   E005     TRUE                NA
ENSG00000000003:E006 ENSG00000000003   E006     TRUE                NA
                     dispFitted dispersion pvalue padjust chr    start      end
ENSG00000000003:E001         NA         NA     NA      NA   X 99883667 99884983
ENSG00000000003:E002         NA         NA     NA      NA   X 99885756 99885863
ENSG00000000003:E003         NA         NA     NA      NA   X 99887482 99887537
ENSG00000000003:E004         NA         NA     NA      NA   X 99887538 99887565
ENSG00000000003:E005         NA         NA     NA      NA   X 99888402 99888438
ENSG00000000003:E006         NA         NA     NA      NA   X 99888439 99888536
                     strand                                     transcripts
ENSG00000000003:E001      -                                 ENST00000373020
ENSG00000000003:E002      -                                 ENST00000373020
ENSG00000000003:E003      -                                 ENST00000373020
ENSG00000000003:E004      -                 ENST00000373020;ENST00000496771
ENSG00000000003:E005      -                 ENST00000373020;ENST00000496771
ENSG00000000003:E006      - ENST00000494424;ENST00000373020;ENST00000496771 
########

Also, I read in another thread that Simon suggested setting fData(ecs)$dispersion <- .1 if a 1 vs 1 comparison is really necessary, although he does not fully recommend it. Is this what I need to do?


2) How does one work with multiple groups (or conditions)? 
The vignette does not cover this, and I cannot see anything relevant in the testForDEU function either. In DESeq (or edgeR), by contrast, one can specify which groups to compare. For example, in DESeq with 3 groups (Control, Treatment and Mutant):

res1 <- nbinomTest(cds, "Control", "Treatment")
and/or 
res2 <- nbinomTest(cds, "Control", "Mutant")
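
To make the question concrete: the pairwise comparisons I have in mind amount to restricting the sample table to two conditions at a time. Below is a minimal base-R sketch of that idea. The sample table, its sample names and the helper pairwise_subset are all hypothetical and made up for illustration, and no DEXSeq call is involved:

```r
# Hypothetical sample table with three groups; names are made up for illustration.
samples <- data.frame(
  Name      = c("c1", "c2", "t1", "t2", "m1", "m2"),
  condition = c("Control", "Control", "Treatment", "Treatment", "Mutant", "Mutant"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only the rows belonging to one pairwise comparison.
pairwise_subset <- function(tab, groupA, groupB) {
  tab[tab$condition %in% c(groupA, groupB), , drop = FALSE]
}

control_vs_treatment <- pairwise_subset(samples, "Control", "Treatment")
control_vs_mutant    <- pairwise_subset(samples, "Control", "Mutant")
print(control_vs_mutant)
```

Whether the ExonCountSet itself can be restricted to such a subset of samples before running the DEXSeq functions, or whether there is a better way to handle this, is part of what I am asking.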


Many Thanks,
Natasha

sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] WriteXLS_2.3.0     gdata_2.12.0       DEXSeq_1.4.0       Biobase_2.18.0    
[5] BiocGenerics_0.4.0

loaded via a namespace (and not attached):
[1] biomaRt_2.14.0 gtools_2.7.0   hwriter_1.3    plyr_1.7.1     RCurl_1.95-3  
[6] statmod_1.4.16 stringr_0.6.1  tools_2.15.2   XML_3.95-0.1  


