[BioC] limma2annaffy output and heatmap questions

Fri Sep 26 22:47:19 CEST 2008

Hui-Yi Chu wrote:
> Hello Jim,
> 
> 
>>> Thank you for the answer of p.value! and sorry for no codes in mail. The
>>> followings are one example of my data and codes.
>>>
>>> *Example *(partially copied from my result table when coef=1, 1st
>>> probeID):
>>> t-statistic  P-value  Fold-change  mut.a         mut.a        wt1
>>> wt2
>>> 108.84      0           10.28           9.90436     9.88196   10.2249
>>> 10.1021
>>>
>> This isn't possible. You can't have  a fold change of 10.28 for these
>> values. The actual number will be something like -0.2 (The mean of 9.9 and
>> 9.88 minus the mean of 10.22 and 10.1 *cannot* be 10.28).
>>
> 
>         I completely agree with you!  And this is why I don't understand the
> table as well.
>         In addition, I've compared the last four columns of this table with
> my original normalized expression values, they are identical, which means
> there might be something with later steps such as contrast?    Was something
> wrong with my codes??
> 
> 
>> *Codes*:
>>> ## data importing
>>> library("affy")
>>> library("limma")
>>> targets <- readTargets("fed_total.txt")
>>> aaa <- ReadAffy(filenames = targets $ filename)
>>> eset <- rma(aaa)
>>> pData(eset)
>>> samples <- data.frame(genotype = c("wt", "wt", "mut.a", "mut.a","mut.b",
>>> "mut.b"),replicate =c(1,2,1,2,1,2))
>>> varInfo <- data.frame(labelDescription=c("wt, mut.a, mut.b", "arb
>>> number"))
>>> pd <- new("AnnotatedDataFrame", data = samples, varMetadata = varInfo)
>>> phenoData(eset) <- new("AnnotatedDataFrame", data = samples, varMetadata =
>>> varInfo)
>>>
>>> ## filtration
>>> library("genefilter")
>>> f1 <- anyNA
>>> f2 <- pOverA(0.25, log2(100))
>>> ff <- filterfun(f1, f2)
>>> selected <- genefilter(eset, ff)
>>> esetsub <- eset[selected,]
>>>
>>> ## limma fit and contrast
>>> library("limma")
>>> design <- model.matrix(~0+factor(c(1,1,2,2,3,3)))
>>> colnames(design) <- c("wt", "mut.a","mut.b")
>>> fit <- lmFit(ef, design)

What is 'ef'? It appears you aren't using the ExpressionSet from above.

>>> fit2 <- eBayes(fit)
>>> contrast.matrix <- makeContrasts("mut.a vs wt" = mut.a-wt,
>>>                                 "mut.b vs wt" = mut.b-wt,
>>>                                 "Diff" = (mut.a-wt)-(mut.b-wt),
>>>                                 levels=design)
>>>
>> You realize that your "Diff" is mut.a - mut.b, yes?
>>            Yes, but I think there might be a better way to compare them on
>> the basis of wt.

No. (3-1)-(2-1) is completely identical to 3-2.

>>
>>
>>  fit2 <- contrasts.fit (fit, contrast.matrix)
>>> fit3 <- eBayes(fit2)
>>>
>>> ## find DEG and draw heatmap
>>> x <- topTable(fit3, coef=1, adjust="fdr", sort.by="P", number=10000)
>>> y <- x[x$adj.P.Val < 0.01,]
>>> results <- decideTests (fit3, method="separate",
>>>                        adjust.method="BH",p.value=0.01,lfc=1)
>>>
>>>  ## Cluster and heatmaps ------ *draw in three ways, but confused with
>>> results*
>>>         ##1.   heatmap(as.matrix(esetsub[y$ID,]), scale="none")
>>>
>>>           #2.   library("Heatplus")
>>>                  heatmap_2(esetsub[y$ID,]),  scale="none")
>>>           #3.  library(gplots)
>>>                 heatmap.2(esetsub[y$ID,]), scale= "row", trace="none",
>>> density.info="none")
>>>               ## or
>>>                 heatmap.2(esetsub[y$ID,]), scale= "none", trace="none",
>>> density.info="none")
>>>
>>> ## html file output
>>> library("affycoretools")
>>> library("yeast2.db")
>>> annotation(eset) <- "yeast2.db"
>>> limma2annaffy(eset, fit3, design, adjust = "fdr",
>>>        contrast.matrix, annotation(eset), pfilt = 0.01, fldfilt= 1)
>>>
>>>
>>> *Questions*:
>>> I have tons of questions about heatmap, and I sincerely appreciate if
>>> anybody can give me some idea.
>>>       1. what is the scale??  I know there are some discussion in the
>>> maillist, but I am still confused. The result from scale=row is easy to
>>> interpret since we can see some blocks across samples, but how it works??
>>>
>> From ?heatmap.2
>>
>>  scale: character indicating if the values should be centered and
>>          scaled in either the row direction or the column direction,
>>          or none.  The default is '"row"' if 'symm' false, and
>>          '"none"' otherwise.
>>
>> You might look at ?scale as well.
>>
>>        2.  when I change the coef=2, the heatmap result is totally
>>> different, I know the coef=2 is comparing mut.b and wt, so what about the
>>> same group of genes in mut.a?
>>>
>> Not sure what this means. It's a different comparison, so you get different
>> genes. If you want the significant genes from the first comparison, just get
>> the significant genes and extract those from your ExpressionSet.
>>          Thank you for the suggestions!
>>
> 
> 
>>
>>        3. The result of limma contains logFC, AvrExprs, t-statistic, p
>>> value,etc., which value that is used to draw cluster?? logFC or AveExprs??
>>> Some papers their color key display fold change, does that mean I have to
>>> do
>>> something to get the ratio before I draw heatmap?
>>>
>> Again, not sure what you are getting at. You aren't using anything from
>> limma except for the gene names that are being called significant. The data
>> are the expression values from your ExpressionSet.
>>
>>
>>
>>> 4. same question with html table, how does the fold change generate?
>>>
>> It is the difference between the average expression for each group.
>>
>>
>>
>>> Thank you very much!!!!!!!!!!
>>> Hui-Yi
>>>
>>>
>>> *sessionInfo()*
>>> R version 2.7.2 (2008-08-25)
>>> i386-pc-mingw32
>>>
>>> locale:
>>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
>>> States.1252;LC_MONETARY=English_United
>>> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>>
>>> attached base packages:
>>> [1] splines   tools     stats     graphics  grDevices utils     datasets
>>> [8] methods   base
>>>
>>> other attached packages:
>>>  [1] yeast2.db_2.2.0      affycoretools_1.12.1 annaffy_1.12.1
>>>  [4] KEGG.db_2.2.0        biomaRt_1.14.1       RCurl_0.9-3
>>>  [7] GOstats_2.6.0        Category_2.6.0       RBGL_1.16.0
>>> [10] annotate_1.18.0      xtable_1.5-3         GO.db_2.2.0
>>> [13] AnnotationDbi_1.2.2  RSQLite_0.7-0        DBI_0.2-4
>>> [16] graph_1.18.1         affyPLM_1.16.0       gcrma_2.12.1
>>> [19] matchprobes_1.12.1   genefilter_1.20.0    survival_2.34-1
>>> [22] yeast2cdf_2.2.0      limma_2.14.6         affy_1.18.2
>>> [25] preprocessCore_1.2.1 affyio_1.8.1         Biobase_2.0.1
>>>
>>> loaded via a namespace (and not attached):
>>> [1] cluster_1.11.11 XML_1.94-0.1
>>>
>>>
>>>
>>> On Fri, Sep 26, 2008 at 8:32 AM, James W. MacDonald
>>> <jmacdon at med.umich.edu>wrote:
>>>
>>>  Hi Hui-Yi,
>>>> Hui-Yi Chu wrote:
>>>>
>>>>  Hi everyone,
>>>>> Apologize first for probably simple question below.
>>>>> Thanks for Jim's help, I've got some html files from limma2annaffy
>>>>> function,
>>>>> however, I am not pretty sure how to interpret the result of "fold
>>>>> change".
>>>>> Please correct me if I am wrong. Based on my knowledge, after limma, for
>>>>> example in coef=1, the fold change represents the contrast of the
>>>>> expression
>>>>> value means from two groups. But the html result made me confused:
>>>>>
>>>>> The header of html is like this:
>>>>> probes   Description   Pubmed   Gene Ontology   Pathway    t-statistic
>>>>> P-value    Fold-change    array1   array1  array2   array2
>>>>>
>>>>>
>>>>> The question is why the fold change is not really the value that (mean
>>>>> of
>>>>> group1)- (mean of group2) ?? If it is because of permutation??
>>>>>
>>>>>  The fold change *is* mean(goup1) - mean(group2). If you show us your
>>>> code
>>>> and a snippet of the output we might be able to help (please see the
>>>> posting
>>>> guide).
>>>>
>>>>  I've read limma user guide and ?limma2annaffycore, so I believe the
>>>> value
>>>>
>>>>> of
>>>>> array 1 and 2 are expression value.
>>>>> By the way, the function of pflt filter genes by p value, but if I want
>>>>> to
>>>>> filter gene by adj.p.val??
>>>>>
>>>>>  I don't know of any function pflt. However, if you mean the argument
>>>> pfilt,
>>>> then it *does* filter by adjusted p-value if in fact you are adjusting
>>>> the
>>>> p-value to begin with.
>>>>
>>>> Best,
>>>>
>>>> Jim
>>>>
>>>>
>>>>
>>>>  Thank you very much!!
>>>>> Hui-Yi
>>>>>
>>>>>       [[alternative HTML version deleted]]
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at stat.math.ethz.ch
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives:
>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>
>>>>>  --
>>>> James W. MacDonald, M.S.
>>>> Biostatistician
>>>> Hildebrandt Lab
>>>> 8220D MSRB III
>>>> 1150 W. Medical Center Drive
>>>> Ann Arbor MI 48109-0646
>>>> 734-936-8662
>>>>
>>>>
> 
> Thank you with sincere appreciation!!!
> Hui-Yi
> 
> 	[[alternative HTML version deleted]]
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
734-936-8662