[BioC] LIMMA -- obtaining values used to calculate lfc
James W. MacDonald
jmacdon at uw.edu
Sat Jul 28 05:30:49 CEST 2012
Hi Sam,
On 7/27/12 11:51 AM, Sam McInturf [guest] wrote:
> Hello everyone,
> I am working on a set of affymetrix arrays, and I am interested in obtaining the A and B values in lfc = log(A/B).
>
> I wrote several small functions to annotate my topTable outputs, and one of these annotations was the exprs() for each probe, but when comparing the lfc's that topTable output, they are very different than the values that I could calculate with the exprs() values. So I am then lead to believe that the lfc is calculated by expression values that have been adjusted/better estimated by eBayes() and so on.
Your functions must be wrong. If by lfc you mean logFC, or more
accurately log fold change (being precise with your language always
helps - when people have to infer what you mean then you may well not
get your question answered to your satisfaction), then there is no
adjustment to this quantity by the empirical Bayes step.
Please note here that the empirical Bayes step has to do with
adjustments made to the denominator of your t-statistic, which estimates
the intra-group variability, rather than the numerator which estimates
the log fold change between two samples. There is no adjustment made to
the numerator.
Here is an example you can try:
> set.seed(0xabeef) ## for reproducibility, so you get the exact same
results
> mat <- matrix(rnorm(10000), ncol=10)
> design <- model.matrix(~factor(rep(1:2, each=5)))
> fit <- lmFit(mat, design)
> fit2 <- eBayes(fit)
> topTable(fit2, 2)
logFC t P.Value adj.P.Val B
541 -2.253898 -3.571210 0.0003773231 0.3773231 -0.9993917
398 2.052834 3.238471 0.0012530516 0.5878160 -1.7256870
335 -1.947496 -3.096086 0.0020317486 0.5878160 -2.0155870
928 1.905416 3.027252 0.0025498760 0.5878160 -2.1512171
556 -1.882673 -2.983539 0.0029390799 0.5878160 -2.2358166
506 1.788323 2.845245 0.0045551320 0.7248347 -2.4955945
874 1.774333 2.810355 0.0050738428 0.7248347 -2.5592404
635 -1.687085 -2.681715 0.0074808935 0.7883527 -2.7872997
257 -1.680194 -2.666920 0.0078151322 0.7883527 -2.8128618
461 1.668451 2.654302 0.0081107165 0.7883527 -2.8345537
Note here that the top fake gene is in row 541 of our fake data, and the
log fold change is -2.25
> mean(mat[541,6:10])-mean(mat[541,1:5])
[1] -2.253898
Et voila! Same exact log fold change.
Also note two things - if you are using RMA, and IMO you should be, then
the data in your ExpressionSet will be log base 2. Additionally,
log(A/B) == log(A) - log(B).
Best,
Jim
> So can I take a lfc, A, M, exprs(), et cetera to get the values that the log fold change is directly calculated from?
>
> Sorry if this question has been asked, I have no idea how to describe my problem succinctly.
>
> Cheers!
> Sam McInturf
>
> -- output of sessionInfo():
>
>> sessionInfo()
> R version 2.15.0 (2012-03-30)
> Platform: i386-pc-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> --
> Sent via the guest posting facility at bioconductor.org.
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
More information about the Bioconductor
mailing list