[BioC] [limma] [Rfit] [samr] Gene expression distribution using lmFit and eBayes
Jérôme Lane
jerome.lane at criucpq.ulaval.ca
Sat Nov 23 23:07:43 CET 2013
Dear Gordon,
Thanks for the reply, I appreciate it very much.
Is there a way to assess the "quality" of normalization for genes?
Best regards,
Jerome
On 2013-11-22 20:55, "Gordon K Smyth" <smyth at wehi.EDU.AU> wrote:
>Dear Jerome,
>
>The Shapiro test is only applicable to iid samples, so it is difficult to
>see how it could be used to test normality of expression values in a
>linear modelling context. If you have applied the test to the normalized
>expression values for each gene, then I suspect that the test is actually
>picking up differential expression rather than non-normality.
>
>The limma code is very robust against non-normality. All the usual
>microarray platforms and standard preprocessing procedures produce data
>that is normally distributed to a good enough approximation. Much effort
>has been devoted to developing good preprocessing and normalization
>algorithms.
>
>The concept of "robustness" in statistical analysis goes back to a 1953
>paper by George Box in Biometrika. In that paper, Box wrote of the "remarkable
>property of robustness to non-normality which [tests for comparing means]
>possess". The tests done by limma inherit the robustness property that
>Box was referring to. Box made the point that the robustness of the two
>sample t-test was not improved by checking first for equal variances. He
>said
>
>"To make the preliminary test on variances is rather like putting to sea
>in a rowing boat to find out whether conditions are sufficiently calm for
>an ocean liner to leave port!"
>
>I rather think that, if Box were still alive today, he might say something
>similar about a preliminary Shapiro test!
>
>Best wishes
>Gordon
>
>> Date: Thu, 21 Nov 2013 17:42:21 -0500
>> From: Jerome Lane <jerome.lane at criucpq.ulaval.ca>
>> To: "bioconductor at stat.math.ethz.ch" <bioconductor at stat.math.ethz.ch>
>> Subject: [BioC] [limma] [Rfit] [samr] Gene expression distribution
>> using lmFit and eBayes
>>
>> Hi,
>>
>> About three quarters of my microarray gene expression profiles are
>> non-normally distributed, with most Shapiro test p-values below 10^-5.
>>
>> I tried rank-based linear regression with rfit from the Rfit package
>> (no normality assumption for the residuals) to adjust for covariates,
>> followed by SAM (non-parametric) from the samr package, but the results
>> were not as biologically relevant as those from lmFit + eBayes.
>>
>> I know that the lmFit function can analyse gene expression data that
>> are not strictly normal, but what is the limit?
>>
>> Is it statistically valid to use lmFit + eBayes on my data?
>>
>> Best regards,
>>
>> Jerome Lane
>
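
Gordon's point that a per-gene Shapiro test may be picking up differential
expression rather than non-normality can be illustrated with a small
simulation. The sketch below is not limma code and is not taken from the
thread. It is plain Python using only the standard library, the data are
simulated, and the group sizes (200 per group) and the shift of 4 log2 units
are deliberately exaggerated assumptions. Because the standard library has no
Shapiro-Wilk test, a Jarque-Bera test (based on skewness and kurtosis) stands
in as the normality test. The helper names are hypothetical. The test is
applied once to the raw values of a single differentially expressed gene and
once to the residuals left after subtracting each group's mean.

```python
import math
import random
import statistics


def jarque_bera_pvalue(values):
    """Asymptotic Jarque-Bera p-value for the hypothesis that values are normal."""
    n = len(values)
    mean = statistics.fmean(values)
    centred = [v - mean for v in values]
    m2 = sum(c ** 2 for c in centred) / n
    m3 = sum(c ** 3 for c in centred) / n
    m4 = sum(c ** 4 for c in centred) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    # Under normality JB is approximately chi-square with 2 degrees of
    # freedom, whose upper-tail probability is exactly exp(-x / 2).
    return math.exp(-jb / 2.0)


def simulate_gene(n_per_group=200, shift=4.0, seed=1):
    """Simulate one gene: normal errors in both groups, means differing by shift."""
    rng = random.Random(seed)
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    treated = [rng.gauss(shift, 1.0) for _ in range(n_per_group)]
    return control, treated


def group_residuals(*groups):
    """Subtract each group's own mean, as a one-factor linear model would."""
    residuals = []
    for group in groups:
        group_mean = statistics.fmean(group)
        residuals.extend(v - group_mean for v in group)
    return residuals


control, treated = simulate_gene()
p_raw = jarque_bera_pvalue(control + treated)
p_resid = jarque_bera_pvalue(group_residuals(control, treated))
print(f"raw expression values:   p = {p_raw:.3g}")
print(f"residuals after the fit: p = {p_resid:.3g}")
```

Every simulated observation here has a normally distributed error, yet the
raw values pooled across the two groups are bimodal, so a normality test on
them should give a very small p-value. The residuals, which are what the
normality assumption of a linear model actually concerns, should not be
flagged in the same way. This only shows why a test on raw values can
mislead. It is not a recommendation to pre-test residuals before using limma.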