[BioC] Limma, SAM and Fold Change Calculation
    Gordon K Smyth 
    smyth at wehi.EDU.AU
       
    Fri Jul 13 14:44:12 CEST 2007
    
    
  
Dear Peter,
Since I'm the limma author, it will come as no surprise to you that I view the limma fold change
(diff of log-intensities) as the more natural, consistent and useful.
I don't know why SAM computes fold change as it does.  It seems to me somewhat inconsistent,
because SAM uses the limma log-fold-change as numerator in its own test statistic for computing
statistical significant.  So SAM reports a different fold change measure to that which it uses
internally for determining statistical significance.
Best wishes
Gordon
> Date: Thu, 12 Jul 2007 10:26:54 -0400
> From: "Peter White" <pwhite at mail.med.upenn.edu>
> Subject: [BioC] Limma, SAM and Fold Change Calculation
> To: <bioconductor at stat.math.ethz.ch>
>
> I wondered if anyone could comment on what is generally considered the most
> appropriate method of calculating fold change values from Affy data. I have
> a data set from a test vs. control experiment (n=3 in each group) performed
> on the MOE430_2 array. I used the Affy package to read in the CEL file data
> and GCRMA to normalize. Finally I used limma to analyze (lmFit and eBayes).
> Now the FC coefficients I get back from limma appear to be log2(2^(mean(test
> replicates in log2))/(2^mean(control replicates in log2)). For example:
>
>> exprs(eset)[18731,1:6]
>  TA2.CEL  TA5.CEL  TA7.CEL  TA1.CEL  TA8.CEL  TA9.CEL
> 3.294215 4.851795 4.403851 4.782934 7.716893 9.284909
>
>> fit$coefficients[18731,1:2] #returns the mean of the above log2 values
>   Fed_MT   Fed_WT
> 4.183287 7.261578
>
>> fit2$coefficients[18731,1]
> [1] -3.078291
>
>> -1/2^fit2$coefficients[18731,1]
> [1] -8.446136
>
> So we have a probe that is reporting a -8.4 fold downregulation.
>
> The problem is that SAM does not calculate the fold change values in the
> same manner. It appears to take the average of the unlogged data and then
> use those values to calculate fold changes. Thus:
>
>> 2^exprs(eset)[18731,1:6]
>    TA2.CEL    TA5.CEL    TA7.CEL    TA1.CEL    TA8.CEL    TA9.CEL
>   9.809738  28.875927  21.168555  27.530016 210.385700 623.786694
>
>> tmp <- c(mean(2^exprs(eset)[18731,1:3]),mean(2^exprs(eset)[18731,4:6]))
>> tmp
> [1]  19.95141 287.23414
>
>> tmp[1]/tmp[2]
> [1] 0.06946043
>
>> -1/(tmp[1]/tmp[2])
> [1] -14.39669
>
> So now we have -14.4 fold downregulation.
>
> The example given is of course an extreme, but using the limma method there
> are 282 probes with a fold change >2, while using the sam method there are
> only 159, with 141 probes in common. The probes that were called uniquely by
> limma were almost all downregulated (-2 to -5 fold).
>
> MY QUESTION IS WHICH METHOD IS THE CORRECT WAY?
>
> Thanks,
>
> Peter
>
> Peter White, Ph.D.
> Technical Director
> Functional Genomics Core
> Department of Genetics
> University of Pennsylvania
> 570 Clinical Research Building
> 415 Curie Boulevard
> Philadelphia, PA 19104-6145
> ?
> Tel: +1 (215) 898-0773
> Fax: +1 (215) 573-2326
> E-mail: pwhite at mail.med.upenn.edu
> ?
> http://www.med.upenn.edu/pdc/cores_fgc.html
> http://www.med.upenn.edu/kaestnerlab/
> http://www.betacell.org/microarrays
> http://www.cbil.upenn.edu/EPConDB/
    
    
More information about the Bioconductor
mailing list