[BioC] Unexpectedly high FDR using contrasts in Limma?
Gordon K Smyth
smyth at wehi.EDU.AU
Fri Apr 6 14:33:21 CEST 2007
Dear Jose,
It all looks normal and expected to me. Are you surprised that you get a significant MBD1
comparison but not DBL-MBD1? Don't forget that direct comparison are more precision, and hence
more powerful, than indirect comparisons. Look at the stdev.unscaled from limma and you'll see
that larger values for DBL-MBD1.
Other things like including a dye-effect, background correcting, or otherwise checking the data
often help give more powerful results.
Best wishes
Gordon
> Date: Thu, 05 Apr 2007 19:45:01 +0100
> From: J.delasHeras at ed.ac.uk
> Subject: [BioC] Unexpectedly high FDR using contrasts in Limma?
> To: bioconductor at stat.math.ethz.ch
> Message-ID: <20070405194501.1dq6rz6tlogk40g4 at www.staffmail.ed.ac.uk>
>
>
> Hi everyone,
>
> I'm analysing (limma 2.9.10) a set of twelve 2-colour cDNA arrays (4
> experiments of 3 slides each) using a common reference design. I find
> that I get very high adjusted P values (BH) using contrasts to compare
> some of these experiments.
>
> No adjusted P value is lower than 0.9, which would indicate there's
> not enough evidence that any gene behaves different in either
> experiment, and I find that surprising. Yes, all four experiments are
> quite similar, but there are some genes that do stand out enough (I'd
> think!) and I have confirmation by RT that this is the case.
>
> So I am wondering whether I don't fully understand where these
> adjusted p values come from, or whether I'm not asking the question I
> think I am asking, or maybe an error on my code?
>
> Here's a summary of what I did:
>
> - 12 files read with read.maimages
>
>> targets #H226 is the common reference
> SlideNumber FileName Cy3 Cy5
> 1 25 140025-3.gpr H226 MBD1
> 2 26 140026-3.gpr MBD1 H226
> 3 23 140023-3.gpr H226 MBD1
> 4 19 140019-3.gpr MBD2 H226
> 5 17 140017-3.gpr H226 MBD2
> 6 15 140015-3.gpr MBD2 H226
> 7 20 140020-3.gpr H226 MeCP2
> 8 18 140018-3.gpr MeCP2 H226
> 9 16 140016-3.gpr MeCP2 H226
> 10 22 140022-3.gpr H226 DBL
> 11 24 140024-3.gpr DBL H226
> 12 14 140014-3.gpr DBL H226
>
> - background corrected using a custom method: RGList object 'RG.s'
>
>> design # H226 as common reference
> DBL MBD1 MBD2 MeCP2
> 1 0 1 0 0
> 2 0 -1 0 0
> 3 0 1 0 0
> 4 0 0 -1 0
> 5 0 0 1 0
> 6 0 0 -1 0
> 7 0 0 0 1
> 8 0 0 0 -1
> 9 0 0 0 -1
> 10 1 0 0 0
> 11 -1 0 0 0
> 12 -1 0 0 0
>
>> MA.sw<-normalizeWithinArrays(RG.s, layout, bc.method="none",
>> method="printtiploess")
>
>> fit.sw<-lmFit(MA.sw, design, method="ls")
>> eb.sw<-eBayes(fit.sw, proportion=0.01)
>
> - I get the adjusted P values from 'topTable':
>
>> tt.sw.MBD1<-topTable(eb.sw, coef=2, n=number, genelist=gal,
>> adjust.method="BH", sort.by="P")
>> tt.sw.MBD2<-topTable(eb.sw, coef=3, n=number, genelist=gal,
>> adjust.method="BH", sort.by="P")
>> tt.sw.MeCP2<-topTable(eb.sw, coef=4, n=number, genelist=gal,
>> adjust.method="BH", sort.by="P")
>> tt.sw.DBL<-topTable(eb.sw, coef=1, n=number, genelist=gal,
>> adjust.method="BH", sort.by="P")
>
> The values I get here look alright and make sense.
>
> From the 4 experiments, the 4th one (DBL) is a control experiment. I
> would like to compare the other three to it, to see what genes are
> differentially expressed between each of the first three experiments
> and the control. I wanted to use contrasts for this. This is how I did
> it:
>
>> levels<-c("DBL", "MBD1", "MBD2", "MeCP2")
>> contrasts<-makeContrasts(DBL,MBD1,MBD2,MeCP2,DBL-MBD1,DBL-MBD2,DBL-MeCP2,levels=levels)
>
>> contrasts
> Contrasts
> Levels DBL MBD1 MBD2 MeCP2 DBL - MBD1 DBL - MBD2 DBL - MeCP2
> DBL 1 0 0 0 1 1 1
> MBD1 0 1 0 0 -1 0 0
> MBD2 0 0 1 0 0 -1 0
> MeCP2 0 0 0 1 0 0 -1
>
>> fitc<-contrasts.fit(fit.sw,contrasts)
>> ebfitc<-eBayes(fitc)
>
> then I use 'topTable' again to get the adjusted p values etc:
>
>> tt.cMBD1<-topTable(ebfitc, coef=5, n=number, genelist=gal,
>> adjust.method="BH", sort.by="P")
>> tt.cMBD2<-topTable(ebfitc, coef=6, n=number, genelist=gal,
>> adjust.method="BH", sort.by="P")
>> tt.cMeCP2<-topTable(ebfitc, coef=7, n=number, genelist=gal,
>> adjust.method="BH", sort.by="P")
>
> the top of the list makes sense, I picked up the genes I was
> expecting, however the adjusted p values are terrible, so I wonder if
> this is real or I am making a mistake somewhere.
>
> As an example, here's a gene I know to be differentially expressed: CDKN1C
> This gene has negligible expression in the common reference, it has
> negligible expression after the control experiment (DBL) is performed,
> but it is clearly expressed after the MBD1 experiment is performed
> (verified by RT). This is what I get:
>
> A.mean = 10.48 (it goes up to 11.6 in the individual slides of the
> MBD1 experiment, it stays low in all others)
>
> M (MBD1) = 2.98
> P (MBD1) = 6.07e-05
> adj p val (MBD1) = 0.00044
>
> M (DBL) = -0.44
> P (DBL) = 0.38
> adj p val (DBL) = 0.56
>
> M (DBL-MBD1) = -3.43
> P (DBL-MBD1) = 0.00036
> adj p val (DBL-MBD1) = 0.95
>
> I am surprised this gene (and several others like this one) give me
> such poor adjusted p values...
>
> why?
>
> Jose
>
> --
> Dr. Jose I. de las Heras Email: J.delasHeras at ed.ac.uk
> The Wellcome Trust Centre for Cell Biology Phone: +44 (0)131 6513374
> Institute for Cell & Molecular Biology Fax: +44 (0)131 6507360
> Swann Building, Mayfield Road
> University of Edinburgh
> Edinburgh EH9 3JR
> UK
More information about the Bioconductor
mailing list