[BioC] setting contrast in edgeR
Ryan C. Thompson
rct at thompsonclan.org
Sat Jan 5 00:46:54 CET 2013
Hello Alpesh,
If you inspect your design matrix, you should see that it contains an
intercept column (named "(Intercept)" and two contrast columns (named
"grp2" and "grp3"). It looks like you are rather expecting one column
per group. With the design matrix that you specify, the intercept
column represents cond1, and the other two columns represent
"cond2-cond1" and "cond3-cond1" respectively, Therefore your specified
contrast adds up thusly:
-1 * (cond1) + 0.5 * (cond2-cond1) + 0.5 * (cond3 - cond1) = (-cond1) +
0.5 * (cond2 + cond3) - cond1 = (cond2+cond3)/2 - 2 * cond1.
In your artificial case where all conditions are equal, this clearly
adds up to -cond1, not zero, so you will see significant calls. If you
were expecting a design matrix with one column for each condition, you
can get it by including a zero in the call to model.matirx, which will
leave out the "(Intercept)" term, like this:
design <- model.matrix(~0+grp)
Then the contrast that you are using should be correct. Alternatively,
if I understand you correctly, you can use the existing design matrix
with the intercept column and change your contrast to c(0, .5, .5).
Also, if you want "cond1 - mean(cond2, cond3)", I think you will want
to swap the signs of the elements in your contrast. This won't change
the p-values, but it will make the fold changes have the opposite sign.
Please check your work, because I may misremember some of the details
above, but I'm pretty sure the root of your problem is using a design
matrix with an intercept when you expected one without an intercept.
Regards,
-Ryan Thompson
On Fri 04 Jan 2013 02:53:25 PM PST, Alpesh Querer wrote:
> Hi,
>
> I`m trying to find significant DE genes using the contrast (-1,.5,.5).
> I want to test {cond1-mean(cond2,cond3)}
> I created a data-set where all conditions have exact same values, divided
> into 3 conditions and 3 replicates/condition.
>
> Here is the data I`m using.
>
> 150 150 150 150 150 150 150 150 150
> 510 510 510 510 510 510 510 510 510
> 550 550 550 550 550 550 550 550 550
> 500 500 500 500 500 500 500 500 500
> 501 501 501 501 501 501 501 501 501
> 505 505 505 505 505 505 505 505 505
> 50 50 50 50 50 50 50 50 50
>
>
> If the run the following code, I get very significant p-values (in lrt
> table) and all genes are declared to be up-regulated.
> Am I interpreting the output incorrectly or do I need to set up the
> contrast in a different way?
>
> h <- read.delim("test10.txt")
> h <-h[rowSums(h)>0,]
> grp <- factor(c(1,1,1,2,2,2,3,3,3))
> design <- model.matrix(~grp)
> h <- DGEList(counts=h,lib.size=colSums(h),group=grp)
> h <- calcNormFactors(h)
> h <- estimateGLMCommonDisp(h, design, verbose=TRUE)
> h <- estimateGLMTrendedDisp(h, design)
> h <- estimateGLMTagwiseDisp(h, design)
> fit <- glmFit(h, design)
> lrt <- glmLRT(fit,contrast=c(-1,0.5,0.5))
> summary(de <- decideTestsDGE(lrt, p=0.05, adjust="BH"))
>
> [,1]
> -1 0
> 0 0
> 1 6
>
>
>
>
>
>> sessionInfo()
> R version 2.15.2 (2012-10-26)
> Platform: i386-w64-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United
> States.1252 LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C LC_TIME=English_United
> States.1252
>
> attached base packages:
> [1] splines stats graphics grDevices utils datasets methods
> base
>
> other attached packages:
> [1] epicalc_2.15.1.0 nnet_7.3-5
> MASS_7.3-22 survival_2.36-14 foreign_0.8-51
> edgeR_3.0.4
> [7] limma_3.14.3 goseq_1.10.0
> geneLenDataBase_0.99.10 BiasedUrn_1.04 GenomicFeatures_1.10.1
> AnnotationDbi_1.20.3
> [13] Biobase_2.18.0 GenomicRanges_1.10.5
> IRanges_1.16.4 BiocGenerics_0.4.0 BiocInstaller_1.8.3
>
> loaded via a namespace (and not attached):
> [1] biomaRt_2.14.0 Biostrings_2.26.2 bitops_1.0-5
> BSgenome_1.26.1 DBI_0.2-5 grid_2.15.2 lattice_0.20-10
> Matrix_1.0-10
> [9] mgcv_1.7-22 nlme_3.1-105 parallel_2.15.2
> RCurl_1.95-3 Rsamtools_1.10.2 RSQLite_0.11.2 rtracklayer_1.18.1
> stats4_2.15.2
> [17] tools_2.15.2 XML_3.95-0.1 zlibbioc_1.4.0
>
> Thanks,
> Al
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
More information about the Bioconductor
mailing list