[BioC] Expected number of DE genes?

Jessica Perry Hekman hekman2 at illinois.edu
Wed Jul 16 01:54:28 CEST 2014


Ryan -- that's extremely helpful, thanks! I'll try PCA and look at my 
dispersion estimates and see what light that sheds. Much appreciated.

Jessica

Jessica P. Hekman, DVM, MS
PhD student, University of Illinois, Urbana-Champaign
Animal Sciences / Genetics, Genomics, and Bioinformatics


On 07/15/2014 05:23 PM, Ryan wrote:
> Hi Jessica,
>
> Have you done some exploratory analysis on your dataset? A good place to
> start would be to generate MDS or PCA plots, using plotMDS for edgeR and
> plotPCA for DESeq2. Do your samples cluster into two groups as expected
> on those plots? Secondly, you should have a look at the dispersion estimates
> from both methods and compare them to typically observed values. You can
> do this with plotBCV for edgeR and plotDispEsts for DESeq2 (but remember
> that BCV is the square root of dispersion, so pay attention to the
> y-axis label). If your dispersions are too high, this indicates that the
> variation within groups is large, which means that detecting significant
> differences between groups is difficult and you will get fewer genes.
> The edgeR User's Guide says in section 2.10 that typical BCV values are
> 0.1 for genetically identical animals (e.g. lab mice) and 0.4 for human
> samples. The latter value is probably closer to what you should expect.
> If your BCV is a lot higher than that, your experiment may not be
> well-controlled, or there may be some other problem in the methods or
> the data that you need to track down.
>
> Hope this helps,
>
> -Ryan Thompson.
>
> On Tue Jul 15 14:54:47 2014, Jessica Perry Hekman wrote:
>> I'm getting only a few dozen differentially expressed genes when I
>> analyze my RNA-Seq data with DESeq2 (79) and edgeR (34) (even fewer
>> when I use EBSeq). I had expected many more -- hundreds or even a
>> thousand. If this is the real answer, I'm fine with it, but I'm
>> concerned that I'm doing something wrong. What are the ranges of
>> numbers of differentially expressed genes that one would expect from
>> DESeq2 or edgeR?
>>
>> More information:
>>
>> I'm in the midst of my first RNA-seq project (as many of you have
>> probably surmised from my frequent postings to a variety of lists). My
>> initial goal is to get a list of differentially expressed (DE) genes.
>>
>> I have 24 samples, 12 from each of 2 treatment groups.
>>
>> My species is fox (Vulpes vulpes), which aligns very nicely to dog
>> (Canis familiaris).
>>
>> My current approach is to use the dog reference genome (to which my
>> fox reads align at about 83%) + GTF with location of exons.
>>
>> Can I feel confident about DESeq2 and edgeR's calls?
>>
>> Thanks very much for any insights,
>>
>> Jessica
>>
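
Ryan's caution about the y-axis (plotBCV shows BCV, while plotDispEsts shows dispersion) is easy to trip over, so here is a minimal sketch of the conversion, using only the relationship he states: BCV is the square root of dispersion. The function names and the dictionary of typical values are made up for illustration and are not part of edgeR or DESeq2; the two BCV figures are the ones he quotes from the edgeR User's Guide.

```python
import math


def bcv_to_dispersion(bcv: float) -> float:
    """Convert a biological coefficient of variation to a dispersion."""
    if bcv < 0:
        raise ValueError("BCV must be non-negative")
    return bcv ** 2


def dispersion_to_bcv(dispersion: float) -> float:
    """Convert a dispersion back to a BCV (its square root)."""
    if dispersion < 0:
        raise ValueError("dispersion must be non-negative")
    return math.sqrt(dispersion)


# Typical BCV values quoted in the thread (hypothetical container name).
TYPICAL_BCV = {
    "genetically identical animals": 0.1,
    "human samples": 0.4,
}

for label, bcv in TYPICAL_BCV.items():
    dispersion = bcv_to_dispersion(bcv)
    print(f"{label}: BCV = {bcv:.2f} -> dispersion = {dispersion:.3f}")
```

So a BCV of 0.4 on a plotBCV plot corresponds to a dispersion of about 0.16 on a plotDispEsts plot. Convert one scale to the other before comparing the two packages' estimates.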


