[BioC] edgeR and tagwise dispersion: overcorrection for multiple tests?
Gordon K Smyth
smyth at wehi.EDU.AU
Fri Jul 13 08:34:45 CEST 2012
Dear Alessandro,
I haven't seen the MDS plots (because attachments are not distributed to
the list), but don't see anything surprising in what you have reported.
If you compare one group (all C) vs only those members of the other group
that are most different to it (1R+3R), naturally you will find lots of DE
genes.
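To illustrate the point, here is a toy simulation. It is only a sketch under stated assumptions, not edgeR's negative binomial model: it draws normally distributed log-expression values for hypothetical genes in which 1R and 3R respond strongly and 2R, 4R and 5R respond weakly, then runs a pooled two-sample t-test per gene with Benjamini-Hochberg adjustment for the three comparisons. Every name and parameter value (simulate, strong, weak, noise, the gene counts) is made up for illustration and is not taken from your data.

```python
import math
import random


def t_two_sided_p(t, df, steps=200):
    """Two-sided Student-t p-value for integer df.

    Uses the substitution x = sqrt(df) * tan(theta), which turns the tail
    integral into const * integral of cos(theta)^(df-1) over [theta_t, pi/2],
    evaluated here with Simpson's rule (steps must be even).
    """
    theta = math.atan(abs(t) / math.sqrt(df))
    const = math.gamma((df + 1) / 2) / (math.sqrt(math.pi) * math.gamma(df / 2))
    h = (math.pi / 2 - theta) / steps
    f = lambda a: math.cos(a) ** (df - 1)
    total = f(theta) + f(math.pi / 2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(theta + i * h)
    return min(1.0, 2.0 * const * total * h / 3)


def pooled_t(x, y):
    """Equal-variance two-sample t statistic (y minus x) and its df."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    df = nx + ny - 2
    se = math.sqrt(ss / df * (1 / nx + 1 / ny))
    return (my - mx) / se, df


def bh_count(pvals, alpha=0.05):
    """Number of rejections under the Benjamini-Hochberg step-up rule."""
    m = len(pvals)
    k = 0
    for rank, p in enumerate(sorted(pvals), start=1):
        if p <= alpha * rank / m:
            k = rank
    return k


def simulate(seed=1, n_genes=2000, n_resp=400, noise=0.3, strong=2.0, weak=0.3):
    """Count BH-significant genes in comparisons A, B and C (toy model)."""
    rng = random.Random(seed)
    # ASSUMPTION: 1R and 3R respond strongly, 2R, 4R and 5R only weakly.
    shifts = [strong, weak, strong, weak, weak]
    pvals = {"A": [], "B": [], "C": []}
    for g in range(n_genes):
        responds = g < n_resp  # only the first n_resp genes are truly DE
        ctrl = [rng.gauss(0.0, noise) for _ in range(5)]
        trt = [rng.gauss(s if responds else 0.0, noise) for s in shifts]
        ctrl4 = ctrl[:2] + ctrl[3:]  # drop 3C, as in comparisons B and C
        groups = {
            "A": (ctrl, trt),                          # all C vs all R
            "B": (ctrl4, [trt[0], trt[2]]),            # C minus 3C vs {1R,3R}
            "C": (ctrl4, [trt[1], trt[3], trt[4]]),    # C minus 3C vs {2R,4R,5R}
        }
        for name, (c, r) in groups.items():
            t, df = pooled_t(c, r)
            pvals[name].append(t_two_sided_p(t, df))
    return {name: bh_count(p) for name, p in pvals.items()}


results = simulate()
print(results)
```

Under these assumptions, comparison B should report far more significant genes than A or C. In A, pooling all five R samples both dilutes the mean difference and inflates the within-group variance, so few genes survive the adjustment even though the same genes respond. The actual counts depend entirely on the made-up parameters and are not meant to reproduce your numbers.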
Best wishes
Gordon
> Date: Thu, 12 Jul 2012 10:48:01 +0200
> From: "alessandro.guffanti at genomnia.com"
> <alessandro.guffanti at genomnia.com>
> To: Bioconductor mailing list <bioconductor at r-project.org>
> Subject: Re: [BioC] edgeR and tagwise dispersion: overcorrection for
> multiple tests?
>
> Dear colleagues good morning - I am back to an old issue because I am
> now much more certain of what I see - and I begin to wonder whether this
> is due to biology rather than to analytical tools or strategies ..
>
> => Here is my sessionInfo() to begin with:
>
> R version 2.15.0 (2012-03-30)
> Platform: x86_64-pc-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices datasets utils methods base
>
> other attached packages:
> [1] edgeR_2.6.7 limma_3.12.1 R.utils_1.12.1 R.oo_1.9.8
> [5] R.methodsS3_1.4.2
>
> => the experiment description: RNA from five samples and five controls,
> mice, homogeneous stimulus, brain tissue, SAGE with SOLiD with a good
> mapping in the UTR (checked also with genome-wide mapping). Tags have been
> selected with the following parameters: only in UTR; unique mapping; only
> one mismatch; begin with CATG, hence quite stringent. The samples are
> tagged {1 to 5}R for the stimulus and {1 to 5}C for the control.
>
> => MDS plot and simple pairwise regression analysis of the tag counts
> (R vs C, R vs R and C vs C) reveal a clear division of the R samples in
> two groups: {1R, 3R} and {2R,4R,5R}. In addition, one C sample (3C)
> overlaps with two R samples and is removed from comparisons
>
> => three DEG calculations were performed:
> (A) all C vs all R;
> (B) all C minus 3C vs {1R,3R};
> (C) all C minus 3C vs {2R,4R,5R}
>
> => tagwise dispersion; normalization factors calculated for the
> libraries; filtering by minimal CPM in samples leaves between 6000 and
> 7000 genes for each comparison.
>
> => results which make me wonder about what is happening in the R
> (experiment) samples:
>
> Comparison A (ALL vs ALL): TWO genes with significant FDR (BH corrected
> PValue I understand)
> Comparison B (ALL-3C vs 1R,3R): 2099 genes with significant FDR (!)
> Comparison C (ALL-3C vs 2R,4R,5R): 20 genes with significant FDR
>
> Now, excuse my ignorance, but this is a rather strong effect of the
> subsetting of one of the two comparison datasets on the FDR, which I did
> not find in many other similar analyses. In fact, when I first mailed
> the list, I was talking about 'overcorrection for multiple tests'.
>
> Is there any reasonable explanation (apart from {1R,3R} and {2R,4R,5R}
> being totally different samples, which I exclude) for this? Maybe a
> strong dependency between the genes involved in the response to the
> stimulus in the two R subgroups?
>
> I include below the three MDS plots - thanks for any answer and again
> excuse me, maybe there is a trivial reason for this (such as number of
> samples..) but it is a unique situation among my many SAGE
> experiments analyzed with edgeR..
>
> Kind regards,
>
> Alessandro
>
> --
>
> Alessandro Guffanti - Head, Bioinformatics, Genomnia srl
> Via Nerviano, 31 - 20020 Lainate, Milano, Italy
> Ph: +39-0293305.702 Fax: +39-0293305.777
> http://www.genomnia.com
> "When you're curious, you find lots of interesting things to do."
> (Walt Disney)
>