[BioC] LIMMA -- obtaining values used to calculate lfc

Sat Jul 28 05:30:49 CEST 2012

Hi Sam,

On 7/27/12 11:51 AM, Sam McInturf [guest] wrote:
> Hello everyone,
>     I am working on a set of affymetrix arrays, and I am interested in obtaining the A and B values in lfc = log(A/B).
>
> I wrote several small functions to annotate my topTable outputs, and one of these annotations was the exprs() for each probe, but when comparing the lfc's that topTable output, they are very different than the values that I could calculate with the exprs() values.  So I am then lead to believe that the lfc is calculated by expression values that have been adjusted/better estimated by eBayes() and so on.

Your functions must be wrong. If by lfc you mean logFC, or more 
accurately log fold change (being precise with your language always 
helps - when people have to infer what you mean then you may well not 
get your question answered to your satisfaction), then there is no 
adjustment to this quantity by the empirical Bayes step.

Please note here that the empirical Bayes step has to do with 
adjustments made to the denominator of your t-statistic, which estimates 
the intra-group variability, rather than the numerator which estimates 
the log fold change between two samples. There is no adjustment made to 
the numerator.

Here is an example you can try:

 > set.seed(0xabeef)  ## for reproducibility, so you get the exact same 
results
 > mat <- matrix(rnorm(10000), ncol=10)
 > design <- model.matrix(~factor(rep(1:2, each=5)))
 > fit <- lmFit(mat, design)
 > fit2 <- eBayes(fit)
 > topTable(fit2, 2)
         logFC         t      P.Value adj.P.Val          B
541 -2.253898 -3.571210 0.0003773231 0.3773231 -0.9993917
398  2.052834  3.238471 0.0012530516 0.5878160 -1.7256870
335 -1.947496 -3.096086 0.0020317486 0.5878160 -2.0155870
928  1.905416  3.027252 0.0025498760 0.5878160 -2.1512171
556 -1.882673 -2.983539 0.0029390799 0.5878160 -2.2358166
506  1.788323  2.845245 0.0045551320 0.7248347 -2.4955945
874  1.774333  2.810355 0.0050738428 0.7248347 -2.5592404
635 -1.687085 -2.681715 0.0074808935 0.7883527 -2.7872997
257 -1.680194 -2.666920 0.0078151322 0.7883527 -2.8128618
461  1.668451  2.654302 0.0081107165 0.7883527 -2.8345537

Note here that the top fake gene is in row 541 of our fake data, and the 
log fold change is -2.25

 > mean(mat[541,6:10])-mean(mat[541,1:5])
[1] -2.253898

Et voila! Same exact log fold change.

Also note two things - if you are using RMA, and IMO you should be, then 
the data in  your ExpressionSet will be log base 2. Additionally, 
log(A/B) == log(A) - log(B).

Best,

Jim

> So can I take a lfc, A, M, exprs(), et cetera to get the values that the log fold change is directly calculated from?
>
> Sorry if this question has been asked, I have no idea how to describe my problem succinctly.
>
> Cheers!
> Sam McInturf
>
>   -- output of sessionInfo():
>
>> sessionInfo()
> R version 2.15.0 (2012-03-30)
> Platform: i386-pc-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> --
> Sent via the guest posting facility at bioconductor.org.
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor