[BioC] Limma : Single Channel experiment design matrix
James W. MacDonald
jmacdon at uw.edu
Fri Mar 7 16:40:35 CET 2014
Hi Koran,
On 3/7/2014 3:49 AM, Koran [guest] wrote:
> Dear All,
>
> I have a question about how to analyse a single-channel experiment (several groups).
>
> In a first approach, I followed the limma user's guide for several groups (chapter 9.3), and used a contrast
> matrix to make the comparison between two groups among all groups.
>
> I also tried another approach: I took a subset of the expression set containing only the two groups of samples I need to compare, and then followed the two-group approach (chapter 9.2).
>
> The fold change remains the same, but the p-value of the moderated t-test is different:
>
> for the "chapter 9.3" I get this (topTable):
>              logFC   AveExpr        t      P.Value    adj.P.Val        B
> NM_013409 4.804450  9.351186 63.46856 5.198462e-32 2.225306e-27 60.42083
> NM_170685 3.327586  7.476924 43.29198 2.292074e-27 4.102931e-23 51.64301
> NM_021995 3.598441  8.731876 42.94068 2.875416e-27 4.102931e-23 51.44328
> NM_000014 2.686684 11.968353 38.61755 5.481149e-26 4.817512e-22 48.80565
> NM_001747 2.727227  8.834094 38.33543 6.716748e-26 4.817512e-22 48.62109
>
> for the "chapter 9.2", I get this topTable :
>               logFC   AveExpr         t      P.Value    adj.P.Val        B
> NM_013409  4.804450 10.238329  70.14768 7.077519e-15 2.709195e-10 23.07593
> NM_015464  3.868533  9.850459  66.20398 1.265772e-14 2.709195e-10 22.72371
> NM_000119 -3.322662 11.608264 -61.31983 2.733108e-14 3.899871e-10 22.22951
> BC025320   2.908061  7.112412  56.61705 6.089619e-14 6.516958e-10 21.68233
> NM_000014  2.686684 11.682645  53.85715 1.005598e-13 8.609327e-10 21.32326
> NM_170685  3.327586  7.826983  51.22412 1.662803e-13 1.086579e-09 20.95091
>
>
> Of course, logFC remains the same and the average expression values obviously differ, but the p-values are different.
> So I was wondering why, and which approach is best to choose, since one gives results with more statistical power?
The difference between the two models has to do primarily with the
measure of intra-group variability, which is used to construct the
denominator of your t-statistic. This measure is a pooled estimate,
based on all samples in the model. Using more samples does not
necessarily make the estimate smaller (in your output the t-statistic
for NM_013409 is actually a bit smaller in the full model, 63.5 versus
70.1), but it does make the estimate more precise, because it rests on
more residual degrees of freedom. The statistic is then compared to a
t-distribution with more degrees of freedom, which has lighter tails,
so a t-statistic of similar size gives a much smaller p-value.
As a general rule I would think fitting the first model (all groups,
with a contrast for the comparison of interest) would be the
preferred way to go.
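
To make the role of the pooled estimate concrete, here is a rough
sketch in plain Python. It is not limma and it uses an ordinary
t-statistic rather than the moderated one, so take it only as an
illustration. The data are simulated (five hypothetical groups of four
samples with equal within-group SD), and the helper names
pooled_variance, two_sided_p and compare are made up for this example.

```python
import math
import random
import statistics


def pooled_variance(groups):
    """Pooled within-group variance and its residual degrees of freedom."""
    ss = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    df = sum(len(g) - 1 for g in groups)
    return ss / df, df


def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value by Simpson's rule on [0, |t|].

    Adequate for moderate |t| only; not meant for extreme statistics.
    """
    b = abs(t)
    h = b / steps
    total = t_density(0.0, df) + t_density(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 1.0 - 2.0 * (h / 3.0) * total


def compare(a, b, others):
    """t and p for a vs b, pooling variance over two groups or all groups."""
    diff = statistics.mean(a) - statistics.mean(b)
    scale = math.sqrt(1 / len(a) + 1 / len(b))
    out = {}
    for name, groups in (("two_group", [a, b]),
                         ("all_groups", [a, b] + others)):
        var, df = pooled_variance(groups)
        t = diff / (math.sqrt(var) * scale)
        out[name] = {"df": df, "t": t, "p": two_sided_p(t, df)}
    return out


random.seed(1)
# Assumption: simulated data, 5 groups x 4 samples, within-group SD of 1.
means = [0.0, 2.0, 1.0, 3.0, 0.5]
groups = [[random.gauss(m, 1.0) for _ in range(4)] for m in means]
result = compare(groups[1], groups[0], groups[2:])
for name, r in result.items():
    print(f"{name}: df={r['df']}, t={r['t']:.3f}, p={r['p']:.4g}")

# The same t-statistic judged against 6 versus 15 degrees of freedom.
print(two_sided_p(3.0, 6), two_sided_p(3.0, 15))
```

The difference in means is identical in both fits; only the pooled
variance and its degrees of freedom change (6 for the two-group fit, 15
when all five groups are used). The last line shows the second effect
in isolation: the same t-statistic gives a smaller p-value when it is
referred to a t-distribution with more degrees of freedom.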
Best,
Jim
>
> Thank you for your kind answers.
>
> Koran
>
> -- output of sessionInfo():
>
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices datasets utils methods base
>
> other attached packages:
> [1] RColorBrewer_1.0-5 R.basic_0.53.0 R.utils_1.29.8 R.oo_1.18.0 R.methodsS3_1.6.1
> [6] plotrix_3.5-3 multicore_0.1-7 pvclust_1.2-2 arrayQualityMetrics_3.18.0 impute_1.36.0
> [11] marray_1.40.0 limma_3.18.13 fortunes_1.5-2 snowfall_1.84-6 snow_0.3-13
>
> loaded via a namespace (and not attached):
> [1] affy_1.40.0 affyio_1.30.0 affyPLM_1.38.0 annotate_1.40.1 AnnotationDbi_1.24.0 beadarray_2.12.0
> [7] BeadDataPackR_1.14.0 Biobase_2.22.0 BiocGenerics_0.8.0 BiocInstaller_1.12.0 Biostrings_2.30.1 Cairo_1.5-5
> [13] cluster_1.14.4 colorspace_1.2-4 DBI_0.2-7 Formula_1.1-1 gcrma_2.34.0 genefilter_1.44.0
> [19] grid_3.0.2 Hmisc_3.14-2 hwriter_1.3 IRanges_1.20.6 KernSmooth_2.23-10 lattice_0.20-27
> [25] latticeExtra_0.6-26 parallel_3.0.2 plyr_1.8.1 preprocessCore_1.24.0 Rcpp_0.11.0 reshape2_1.2.2
> [31] RSQLite_0.11.4 setRNG_2011.11-2 splines_3.0.2 stats4_3.0.2 stringr_0.6.2 survival_2.37-7
> [37] SVGAnnotation_0.93-1 tools_3.0.2 vsn_3.30.0 XML_3.95-0.2 xtable_1.7-1 XVector_0.2.0
> [43] zlibbioc_1.8.0
>
>
> --
> Sent via the guest posting facility at bioconductor.org.
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099