[BioC] time course differential analysis - design matrix
James W. MacDonald
jmacdon at uw.edu
Fri Mar 28 14:41:04 CET 2014
Hi Agata,
On 3/28/2014 5:18 AM, Agata [guest] wrote:
> Dear all,
>
> I am doing differential expression analysis and I have a question concerning time course experiments (Single-Channel Experimental Designs).
>
> I have one cell line that was treated in 4 different ways. I want to check which genes respond differently over time for different treatments. I did 4 different comparisons.
>
> I have treatments A, B, C and D, and I compared groups: A-B, A-C, C-D and B-D. For all my data I created ONE design matrix and FOUR contrast.diff.matrices. For the lmFit() function I used the esetPROC with all my data. This was followed by the contrasts.fit() and eBayes() functions. At the end I got the top differentially expressed genes (from the topTableF() function).
>
> Additionally, I did almost the same thing, but I created FOUR different design matrices and FOUR contrast.diff.matrices for all my comparisons. I extracted the subset of esetPROC only with the data I needed for the comparison, and continued as described above.
>
> I got different results for those two approaches. The adj.p.values were much smaller for the first approach than for the second one. I assume it is because of the eBayes function. Could you please explain to me which approach is the correct/better one and why?
It's a combination of the eBayes function and the fact that, in the
first analysis, you estimate the gene-wise variances using all the data
rather than a subset. Using all the samples gives each variance estimate
more residual degrees of freedom.
Both methods are technically correct, but as you have already noted, the
first method is more powerful, so in my mind it is to be preferred.
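As a rough illustration, here is a minimal sketch on simulated data. It
assumes a single treatment factor with three replicates per group and
leaves out the time factor, so it is a simplification of your design.
The object names (expr, treatment, fit_all, fit_sub) are made up for
the example and are not taken from your analysis.

```r
# Minimal sketch on simulated data; all object names are hypothetical.
library(limma)

set.seed(1)
treatment <- factor(rep(c("A", "B", "C", "D"), each = 3))
expr <- matrix(rnorm(1000 * 12), nrow = 1000,
               dimnames = list(paste0("gene", 1:1000), paste0("s", 1:12)))

# Approach 1: one design matrix for all 12 samples, then the A - B contrast
design_all <- model.matrix(~ 0 + treatment)
colnames(design_all) <- levels(treatment)
fit_all <- lmFit(expr, design_all)
cm_all <- makeContrasts(AvsB = A - B, levels = design_all)
fit_all2 <- eBayes(contrasts.fit(fit_all, cm_all))

# Approach 2: keep only the A and B samples and fit a separate model
keep <- treatment %in% c("A", "B")
trt_sub <- droplevels(treatment[keep])
design_sub <- model.matrix(~ 0 + trt_sub)
colnames(design_sub) <- levels(trt_sub)
fit_sub <- lmFit(expr[, keep], design_sub)
cm_sub <- makeContrasts(AvsB = A - B, levels = design_sub)
fit_sub2 <- eBayes(contrasts.fit(fit_sub, cm_sub))

# Residual degrees of freedom behind each gene-wise variance estimate
fit_all$df.residual[1]  # 12 samples - 4 coefficients = 8
fit_sub$df.residual[1]  #  6 samples - 2 coefficients = 4

# Compare the two sets of results for the same contrast
head(topTable(fit_all2, coef = "AvsB"))
head(topTable(fit_sub2, coef = "AvsB"))
```

With a group-means design like this one, the estimated A - B log fold
changes are the same in both fits. What differs is the variance estimate
and its degrees of freedom, and hence the moderated statistics and
p-values.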
Best,
Jim
>
> Best wishes,
> Agata
>
>
>
> -- output of sessionInfo():
>
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] C
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] gplots_2.12.1 lattice_0.20-24 sva_3.8.0
> [4] mgcv_1.7-26 nlme_3.1-111 corpcor_1.6.6
> [7] vsn_3.30.0 marray_1.40.0 hgug4112a.db_2.10.1
> [10] org.Hs.eg.db_2.10.1 Agi4x44PreProcess_1.22.0 genefilter_1.44.0
> [13] annotate_1.40.0 AgiMicroRna_2.12.0 affycoretools_1.34.0
> [16] KEGG.db_2.10.1 GO.db_2.10.1 RSQLite_0.11.4
> [19] DBI_0.2-7 AnnotationDbi_1.24.0 preprocessCore_1.24.0
> [22] affy_1.40.0 Biobase_2.22.0 BiocGenerics_0.8.0
> [25] biomaRt_2.18.0 limma_3.18.12 WriteXLS_3.4.0
>
> loaded via a namespace (and not attached):
> [1] AnnotationForge_1.4.4 BSgenome_1.30.0 BiocInstaller_1.12.0
> [4] Biostrings_2.30.1 Category_2.28.0 DESeq2_1.2.10
> [7] Formula_1.1-1 GOstats_2.28.0 GSEABase_1.24.0
> [10] GenomicFeatures_1.14.2 GenomicRanges_1.14.4 Hmisc_3.14-0
> [13] IRanges_1.20.6 KernSmooth_2.23-10 MASS_7.3-29
> [16] Matrix_1.1-2 PFAM.db_2.10.1 R.methodsS3_1.6.1
> [19] R.oo_1.17.0 R.utils_1.29.8 R2HTML_2.2.1
> [22] RBGL_1.38.0 RColorBrewer_1.0-5 RCurl_1.95-4.1
> [25] Rcpp_0.11.0 RcppArmadillo_0.4.000.2 ReportingTools_2.2.0
> [28] Rsamtools_1.14.3 VariantAnnotation_1.8.12 XML_3.98-1.1
> [31] XVector_0.2.0 affyio_1.30.0 annaffy_1.34.0
> [34] biovizBase_1.10.7 bit_1.1-11 bitops_1.0-6
> [37] caTools_1.16 cluster_1.14.4 codetools_0.2-8
> [40] colorspace_1.2-4 dichromat_2.0-0 digest_0.6.4
> [43] edgeR_3.4.2 ff_2.2-12 foreach_1.4.1
> [46] gcrma_2.34.0 gdata_2.13.2 ggbio_1.10.11
> [49] ggplot2_0.9.3.1 graph_1.40.1 grid_3.0.2
> [52] gridExtra_0.9.1 gtable_0.1.2 gtools_3.3.0
> [55] hwriter_1.3 iterators_1.0.6 labeling_0.2
> [58] latticeExtra_0.6-26 locfit_1.5-9.1 munsell_0.4.2
> [61] oligoClasses_1.24.0 plyr_1.8 proto_0.3-10
> [64] reshape2_1.2.2 rtracklayer_1.22.3 scales_0.2.3
> [67] splines_3.0.2 stats4_3.0.2 stringr_0.6.2
> [70] survival_2.37-7 tools_3.0.2 xtable_1.7-1
> [73] zlibbioc_1.8.0
>
>
> packageDescription('limma')$Maintainer
> [1] "Gordon Smyth <smyth at wehi.edu.au>"
>
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099