[BioC] Unexpectedly high FDR using contrasts in Limma?

Fri Apr 6 14:33:21 CEST 2007

Dear Jose,

It all looks normal and expected to me.  Are you surprised that you get a significant MBD1
comparison but not DBL-MBD1?  Don't forget that direct comparison are more precision, and hence
more powerful, than indirect comparisons.  Look at the stdev.unscaled from limma and you'll see
that larger values for DBL-MBD1.

Other things like including a dye-effect, background correcting, or otherwise checking the data
often help give more powerful results.

Best wishes
Gordon

> Date: Thu, 05 Apr 2007 19:45:01 +0100
> From: J.delasHeras at ed.ac.uk
> Subject: [BioC] Unexpectedly high FDR using contrasts in Limma?
> To: bioconductor at stat.math.ethz.ch
> Message-ID: <20070405194501.1dq6rz6tlogk40g4 at www.staffmail.ed.ac.uk>
>
>
> Hi everyone,
>
> I'm analysing (limma 2.9.10) a set of twelve 2-colour cDNA arrays (4
> experiments of 3 slides each) using a common reference design. I find
> that I get very high adjusted P values (BH) using contrasts to compare
> some of these experiments.
>
> No adjusted P value is lower than 0.9, which would indicate there's
> not enough evidence that any gene behaves different in either
> experiment, and I find that surprising. Yes, all four experiments are
> quite similar, but there are some genes that do stand out enough (I'd
> think!) and I have confirmation by RT that this is the case.
>
> So I am wondering whether I don't fully understand where these
> adjusted p values come from, or whether I'm not asking the question I
> think I am asking, or maybe an error on my code?
>
> Here's a summary of what I did:
>
> - 12 files read with read.maimages
>
>> targets                    #H226 is the common reference
>     SlideNumber     FileName   Cy3   Cy5
> 1           25 140025-3.gpr  H226  MBD1
> 2           26 140026-3.gpr  MBD1  H226
> 3           23 140023-3.gpr  H226  MBD1
> 4           19 140019-3.gpr  MBD2  H226
> 5           17 140017-3.gpr  H226  MBD2
> 6           15 140015-3.gpr  MBD2  H226
> 7           20 140020-3.gpr  H226 MeCP2
> 8           18 140018-3.gpr MeCP2  H226
> 9           16 140016-3.gpr MeCP2  H226
> 10          22 140022-3.gpr  H226   DBL
> 11          24 140024-3.gpr   DBL  H226
> 12          14 140014-3.gpr   DBL  H226
>
> - background corrected using a custom method: RGList object 'RG.s'
>
>> design                # H226 as common reference
>     DBL MBD1 MBD2 MeCP2
> 1    0    1    0     0
> 2    0   -1    0     0
> 3    0    1    0     0
> 4    0    0   -1     0
> 5    0    0    1     0
> 6    0    0   -1     0
> 7    0    0    0     1
> 8    0    0    0    -1
> 9    0    0    0    -1
> 10   1    0    0     0
> 11  -1    0    0     0
> 12  -1    0    0     0
>
>> MA.sw<-normalizeWithinArrays(RG.s, layout, bc.method="none",
>> method="printtiploess")
>
>> fit.sw<-lmFit(MA.sw, design, method="ls")
>> eb.sw<-eBayes(fit.sw, proportion=0.01)
>
> - I get the adjusted P values from 'topTable':
>
>> tt.sw.MBD1<-topTable(eb.sw, coef=2, n=number, genelist=gal,
>> adjust.method="BH", sort.by="P")
>> tt.sw.MBD2<-topTable(eb.sw, coef=3, n=number, genelist=gal,
>> adjust.method="BH", sort.by="P")
>> tt.sw.MeCP2<-topTable(eb.sw, coef=4, n=number, genelist=gal,
>> adjust.method="BH", sort.by="P")
>> tt.sw.DBL<-topTable(eb.sw, coef=1, n=number, genelist=gal,
>> adjust.method="BH", sort.by="P")
>
> The values I get here look alright and make sense.
>
>  From the 4 experiments, the 4th one (DBL) is a control experiment. I
> would like to compare the other three to it, to see what genes are
> differentially expressed between each of the first three experiments
> and the control. I wanted to use contrasts for this. This is how I did
> it:
>
>> levels<-c("DBL", "MBD1", "MBD2", "MeCP2")
>> contrasts<-makeContrasts(DBL,MBD1,MBD2,MeCP2,DBL-MBD1,DBL-MBD2,DBL-MeCP2,levels=levels)
>
>> contrasts
>         Contrasts
> Levels  DBL MBD1 MBD2 MeCP2 DBL - MBD1 DBL - MBD2 DBL - MeCP2
>    DBL     1    0    0     0          1          1           1
>    MBD1    0    1    0     0         -1          0           0
>    MBD2    0    0    1     0          0         -1           0
>    MeCP2   0    0    0     1          0          0          -1
>
>> fitc<-contrasts.fit(fit.sw,contrasts)
>> ebfitc<-eBayes(fitc)
>
> then I use 'topTable' again to get the adjusted p values etc:
>
>> tt.cMBD1<-topTable(ebfitc, coef=5, n=number, genelist=gal,
>> adjust.method="BH", sort.by="P")
>> tt.cMBD2<-topTable(ebfitc, coef=6, n=number, genelist=gal,
>> adjust.method="BH", sort.by="P")
>> tt.cMeCP2<-topTable(ebfitc, coef=7, n=number, genelist=gal,
>> adjust.method="BH", sort.by="P")
>
> the top of the list makes sense, I picked up the genes I was
> expecting, however the adjusted p values are terrible, so I wonder if
> this is real or I am making a mistake somewhere.
>
> As an example, here's a gene I know to be differentially expressed: CDKN1C
> This gene has negligible expression in the common reference, it has
> negligible expression after the control experiment (DBL) is performed,
> but it is clearly expressed after the MBD1 experiment is performed
> (verified by RT). This is what I get:
>
> A.mean = 10.48 (it goes up to 11.6 in the individual slides of the
> MBD1 experiment, it stays low in all others)
>
> M (MBD1) = 2.98
> P (MBD1) = 6.07e-05
> adj p val (MBD1) = 0.00044
>
> M (DBL) = -0.44
> P (DBL) = 0.38
> adj p val (DBL) = 0.56
>
> M (DBL-MBD1) = -3.43
> P (DBL-MBD1) = 0.00036
> adj p val (DBL-MBD1) = 0.95
>
> I am surprised this gene (and several others like this one) give me
> such poor adjusted p values...
>
> why?
>
> Jose
>
> --
> Dr. Jose I. de las Heras                      Email: J.delasHeras at ed.ac.uk
> The Wellcome Trust Centre for Cell Biology    Phone: +44 (0)131 6513374
> Institute for Cell & Molecular Biology        Fax:   +44 (0)131 6507360
> Swann Building, Mayfield Road
> University of Edinburgh
> Edinburgh EH9 3JR
> UK