[BioC] Is normalization in edgeR required for small RNA sequencing data?

Mark Robinson mark.robinson at imls.uzh.ch
Sat Sep 22 09:54:14 CEST 2012


Hi Daniela,

> Do I need to normalize my input data using calcNormFactors() once I set
> my DGE list, or could I proceed without any normalization? I assume in this
> case that edgeR performs a default normalization when it is "calculating
> library sizes from column totals"?

Yes, by default edgeR will use column totals to "normalize". You don't strictly *need* to do additional normalization -- e.g. by calling calcNormFactors() -- but generally it does no harm and it often helps. That is, if there are no additional biases (beyond library size) to correct for, these additional correction factors will be near 1 anyway. As a trivial (uninteresting) example:

> library(edgeR)
> # simulated counts are random, so exact values will differ between runs
> y <- matrix( rnbinom(300, mu=5, size=2), nrow=150 )
> d <- DGEList(y)
Calculating library sizes from column totals.
> d$samples
        group lib.size norm.factors
Sample1     1      720            1
Sample2     1      635            1
> d <- calcNormFactors(d)
> d$samples
        group lib.size norm.factors
Sample1     1      720    0.9663861
Sample2     1      635    1.0347831
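
To make the role of these factors concrete, here is a minimal base-R sketch (an illustration, not edgeR's internal code). It assumes the usual edgeR convention that the "effective" library size is the product of lib.size and norm.factors, and it simply reuses the numbers printed in the transcript above.

```r
# Values copied from the d$samples output above
lib.size     <- c(Sample1 = 720, Sample2 = 635)
norm.factors <- c(Sample1 = 0.9663861, Sample2 = 1.0347831)

# Effective library size: library size rescaled by the normalization factor
# (assumption: this product is what downstream scaling is based on)
eff.lib.size <- lib.size * norm.factors
print(round(eff.lib.size, 1))

# The factors are scaled so that their product is (approximately) 1
print(prod(norm.factors))
```

With factors this close to 1, the effective library sizes differ from the column totals by only a few percent, which is what you would expect when there is nothing beyond library size to correct for.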

Of course, it doesn't hurt to look through a few MA-style plots for your data to see that your samples are comparable and that normalization is operating well.
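
As a rough sketch of what such a plot shows, the following base-R code computes M (log-ratio) and A (average log-abundance) values for two samples by hand. The simulated counts, the seed, and the 0.5 offset used to avoid log(0) are all arbitrary choices made for illustration; edgeR also ships its own plotting helpers (e.g. plotSmear()), which you would normally use on real data.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Two simulated samples with no true differences between them
y <- matrix(rnbinom(300, mu = 5, size = 2), nrow = 150)
lib.size <- colSums(y)

# Proportions per sample, with a small offset (assumed here) to avoid log2(0)
prop <- sweep(y + 0.5, 2, lib.size + 1, "/")
logp <- log2(prop)

M <- logp[, 1] - logp[, 2]        # log2 ratio, sample 1 vs sample 2
A <- (logp[, 1] + logp[, 2]) / 2  # average log2 proportion

plot(A, M, pch = 16, cex = 0.6,
     xlab = "A (average log2 proportion)",
     ylab = "M (log2 ratio, sample 1 vs sample 2)")
abline(h = 0, col = "grey40", lty = 2)  # reference line for "no difference"
```

If the samples are comparable and normalization is doing its job, the bulk of the points should scatter around the M = 0 line; a cloud that sits clearly above or below it suggests a bias that library size alone does not capture.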

Best, Mark

----------
Prof. Dr. Mark Robinson
Bioinformatics
Institute of Molecular Life Sciences
University of Zurich
Winterthurerstrasse 190
8057 Zurich
Switzerland

v: +41 44 635 4848
f: +41 44 635 6898
e: mark.robinson at imls.uzh.ch
o: Y11-J-16
w: http://tiny.cc/mrobin




On 22.09.2012, at 00:23, Daniela Lopes Paim Pinto wrote:

> Dear All,
> 
> I am a PhD student, currently working on differential expression analysis of
> my small RNA library deep sequencing data and trying to identify
> differentially expressed miRNAs using the edgeR package. I have 24 different
> samples with 2 biological replicates (48 libraries). I am performing
> multiple group comparisons using the GLM method and also an ANOVA-like test to
> identify DE miRNAs among the different groups of my samples.
> My question is:
> 
> Do I need to normalize my input data using calcNormFactors() once I set
> my DGE list, or could I proceed without any normalization? I assume in this
> case that edgeR performs a default normalization when it is "calculating
> library sizes from column totals"?
> 
> 
> I would really appreciate any suggestion on this!
> 
> 
> Thanks in advance,
> 
> 
> Daniela
