[BioC] limma2annaffy output and heatmap questions
Wang, Jixin
jixinwang at neo.tamu.edu
Sun Sep 28 05:23:25 CEST 2008
Hi Hui-Yi,
I have a question about the heat map. Which value do you use to draw? Raw fluorescence intensity, normalized fluorescence intensity, or normalized log-ratio?
Many thanks,
Jixin
----- Original Message -----
From: "Hui-Yi Chu" <huiyi.chu at gmail.com>
To: "James W. MacDonald" <jmacdon at med.umich.edu>
Cc: bioconductor at stat.math.ethz.ch
Sent: Saturday, September 27, 2008 9:50:16 PM GMT -06:00 US/Canada Central
Subject: Re: [BioC] limma2annaffy output and heatmap questions
Hi everyone,
Just wanna say that I found the reason why I got the odd result of fold
change from limma2annaffy function:
Since I've subsetted my eset as esetsub, the arg of limma2annaffy must be
"esetsub" rather than "eset".
Thank you everyone and cheers,
Hui-Yi
On Fri, Sep 26, 2008 at 5:02 PM, Hui-Yi Chu <huiyi.chu at gmail.com> wrote:
> Here is the code, I may forget to paste it.
>
> esetsub <- eset[selected,]
> ef <- exprs(esetsub)
>
>
>
>
> On Fri, Sep 26, 2008 at 4:47 PM, James W. MacDonald <jmacdon at med.umich.edu
> > wrote:
>
>>
>>
>> Hui-Yi Chu wrote:
>>
>>> Hello Jim,
>>>
>>>
>>> Thank you for the answer of p.value! and sorry for no codes in mail. The
>>>>> followings are one example of my data and codes.
>>>>>
>>>>> *Example *(partially copied from my result table when coef=1, 1st
>>>>> probeID):
>>>>> t-statistic P-value Fold-change mut.a mut.a wt1
>>>>> wt2
>>>>> 108.84 0 10.28 9.90436 9.88196 10.2249
>>>>> 10.1021
>>>>>
>>>>> This isn't possible. You can't have a fold change of 10.28 for these
>>>> values. The actual number will be something like -0.2 (The mean of 9.9
>>>> and
>>>> 9.88 minus the mean of 10.22 and 10.1 *cannot* be 10.28).
>>>>
>>>>
>>> I completely agree with you! And this is why I don't understand
>>> the
>>> table as well.
>>> In addition, I've compared the last four columns of this table
>>> with
>>> my original normalized expression values, they are identical, which means
>>> there might be something with later steps such as contrast? Was
>>> something
>>> wrong with my codes??
>>>
>>>
>>> *Codes*:
>>>>
>>>>> ## data importing
>>>>> library("affy")
>>>>> library("limma")
>>>>> targets <- readTargets("fed_total.txt")
>>>>> aaa <- ReadAffy(filenames = targets $ filename)
>>>>> eset <- rma(aaa)
>>>>> pData(eset)
>>>>> samples <- data.frame(genotype = c("wt", "wt", "mut.a",
>>>>> "mut.a","mut.b",
>>>>> "mut.b"),replicate =c(1,2,1,2,1,2))
>>>>> varInfo <- data.frame(labelDescription=c("wt, mut.a, mut.b", "arb
>>>>> number"))
>>>>> pd <- new("AnnotatedDataFrame", data = samples, varMetadata = varInfo)
>>>>> phenoData(eset) <- new("AnnotatedDataFrame", data = samples,
>>>>> varMetadata =
>>>>> varInfo)
>>>>>
>>>>> ## filtration
>>>>> library("genefilter")
>>>>> f1 <- anyNA
>>>>> f2 <- pOverA(0.25, log2(100))
>>>>> ff <- filterfun(f1, f2)
>>>>> selected <- genefilter(eset, ff)
>>>>> esetsub <- eset[selected,]
>>>>>
>>>>> ## limma fit and contrast
>>>>> library("limma")
>>>>> design <- model.matrix(~0+factor(c(1,1,2,2,3,3)))
>>>>> colnames(design) <- c("wt", "mut.a","mut.b")
>>>>> fit <- lmFit(ef, design)
>>>>>
>>>>
>> What is 'ef'? It appears you aren't using the ExpressionSet from above.
>>
>> fit2 <- eBayes(fit)
>>>>> contrast.matrix <- makeContrasts("mut.a vs wt" = mut.a-wt,
>>>>> "mut.b vs wt" = mut.b-wt,
>>>>> "Diff" = (mut.a-wt)-(mut.b-wt),
>>>>> levels=design)
>>>>>
>>>>> You realize that your "Diff" is mut.a - mut.b, yes?
>>>> Yes, but I think there might be a better way to compare them
>>>> on
>>>> the basis of wt.
>>>>
>>>
>> No. (3-1)-(2-1) is completely identical to 3-2.
>> Is there any better method to get the idea above?
>>
>>
>>
>>>>
>>>> fit2 <- contrasts.fit (fit, contrast.matrix)
>>>>
>>>>> fit3 <- eBayes(fit2)
>>>>>
>>>>> ## find DEG and draw heatmap
>>>>> x <- topTable(fit3, coef=1, adjust="fdr", sort.by="P", number=10000)
>>>>> y <- x[x$adj.P.Val < 0.01,]
>>>>> results <- decideTests (fit3, method="separate",
>>>>> adjust.method="BH",p.value=0.01,lfc=1)
>>>>>
>>>>> ## Cluster and heatmaps ------ *draw in three ways, but confused with
>>>>> results*
>>>>> ##1. heatmap(as.matrix(esetsub[y$ID,]), scale="none")
>>>>>
>>>>> #2. library("Heatplus")
>>>>> heatmap_2(esetsub[y$ID,]), scale="none")
>>>>> #3. library(gplots)
>>>>> heatmap.2(esetsub[y$ID,]), scale= "row", trace="none",
>>>>> density.info="none")
>>>>> ## or
>>>>> heatmap.2(esetsub[y$ID,]), scale= "none", trace="none",
>>>>> density.info="none")
>>>>>
>>>>> ## html file output
>>>>> library("affycoretools")
>>>>> library("yeast2.db")
>>>>> annotation(eset) <- "yeast2.db"
>>>>> limma2annaffy(eset, fit3, design, adjust = "fdr",
>>>>> contrast.matrix, annotation(eset), pfilt = 0.01, fldfilt= 1)
>>>>>
>>>>>
>>>>> *Questions*:
>>>>> I have tons of questions about heatmap, and I sincerely appreciate if
>>>>> anybody can give me some idea.
>>>>> 1. what is the scale?? I know there are some discussion in the
>>>>> maillist, but I am still confused. The result from scale=row is easy to
>>>>> interpret since we can see some blocks across samples, but how it
>>>>> works??
>>>>>
>>>>> From ?heatmap.2
>>>>
>>>> scale: character indicating if the values should be centered and
>>>> scaled in either the row direction or the column direction,
>>>> or none. The default is '"row"' if 'symm' false, and
>>>> '"none"' otherwise.
>>>>
>>>> You might look at ?scale as well.
>>>>
>>>> 2. when I change the coef=2, the heatmap result is totally
>>>>
>>>>> different, I know the coef=2 is comparing mut.b and wt, so what about
>>>>> the
>>>>> same group of genes in mut.a?
>>>>>
>>>>> Not sure what this means. It's a different comparison, so you get
>>>> different
>>>> genes. If you want the significant genes from the first comparison, just
>>>> get
>>>> the significant genes and extract those from your ExpressionSet.
>>>> Thank you for the suggestions!
>>>>
>>>>
>>>
>>>
>>>> 3. The result of limma contains logFC, AvrExprs, t-statistic, p
>>>>
>>>>> value,etc., which value that is used to draw cluster?? logFC or
>>>>> AveExprs??
>>>>> Some papers their color key display fold change, does that mean I have
>>>>> to
>>>>> do
>>>>> something to get the ratio before I draw heatmap?
>>>>>
>>>>> Again, not sure what you are getting at. You aren't using anything
>>>> from
>>>> limma except for the gene names that are being called significant. The
>>>> data
>>>> are the expression values from your ExpressionSet.
>>>>
>>>>
>>>>
>>>> 4. same question with html table, how does the fold change generate?
>>>>>
>>>>> It is the difference between the average expression for each group.
>>>>
>>>>
>>>>
>>>> Thank you very much!!!!!!!!!!
>>>>> Hui-Yi
>>>>>
>>>>>
>>>>> *sessionInfo()*
>>>>> R version 2.7.2 (2008-08-25)
>>>>> i386-pc-mingw32
>>>>>
>>>>> locale:
>>>>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
>>>>> States.1252;LC_MONETARY=English_United
>>>>> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>>>>
>>>>> attached base packages:
>>>>> [1] splines tools stats graphics grDevices utils
>>>>> datasets
>>>>> [8] methods base
>>>>>
>>>>> other attached packages:
>>>>> [1] yeast2.db_2.2.0 affycoretools_1.12.1 annaffy_1.12.1
>>>>> [4] KEGG.db_2.2.0 biomaRt_1.14.1 RCurl_0.9-3
>>>>> [7] GOstats_2.6.0 Category_2.6.0 RBGL_1.16.0
>>>>> [10] annotate_1.18.0 xtable_1.5-3 GO.db_2.2.0
>>>>> [13] AnnotationDbi_1.2.2 RSQLite_0.7-0 DBI_0.2-4
>>>>> [16] graph_1.18.1 affyPLM_1.16.0 gcrma_2.12.1
>>>>> [19] matchprobes_1.12.1 genefilter_1.20.0 survival_2.34-1
>>>>> [22] yeast2cdf_2.2.0 limma_2.14.6 affy_1.18.2
>>>>> [25] preprocessCore_1.2.1 affyio_1.8.1 Biobase_2.0.1
>>>>>
>>>>> loaded via a namespace (and not attached):
>>>>> [1] cluster_1.11.11 XML_1.94-0.1
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Sep 26, 2008 at 8:32 AM, James W. MacDonald
>>>>> <jmacdon at med.umich.edu>wrote:
>>>>>
>>>>> Hi Hui-Yi,
>>>>>
>>>>>> Hui-Yi Chu wrote:
>>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>>> Apologize first for probably simple question below.
>>>>>>> Thanks for Jim's help, I've got some html files from limma2annaffy
>>>>>>> function,
>>>>>>> however, I am not pretty sure how to interpret the result of "fold
>>>>>>> change".
>>>>>>> Please correct me if I am wrong. Based on my knowledge, after limma,
>>>>>>> for
>>>>>>> example in coef=1, the fold change represents the contrast of the
>>>>>>> expression
>>>>>>> value means from two groups. But the html result made me confused:
>>>>>>>
>>>>>>> The header of html is like this:
>>>>>>> probes Description Pubmed Gene Ontology Pathway
>>>>>>> t-statistic
>>>>>>> P-value Fold-change array1 array1 array2 array2
>>>>>>>
>>>>>>>
>>>>>>> The question is why the fold change is not really the value that
>>>>>>> (mean
>>>>>>> of
>>>>>>> group1)- (mean of group2) ?? If it is because of permutation??
>>>>>>>
>>>>>>> The fold change *is* mean(goup1) - mean(group2). If you show us your
>>>>>>>
>>>>>> code
>>>>>> and a snippet of the output we might be able to help (please see the
>>>>>> posting
>>>>>> guide).
>>>>>>
>>>>>> I've read limma user guide and ?limma2annaffycore, so I believe the
>>>>>> value
>>>>>>
>>>>>> of
>>>>>>> array 1 and 2 are expression value.
>>>>>>> By the way, the function of pflt filter genes by p value, but if I
>>>>>>> want
>>>>>>> to
>>>>>>> filter gene by adj.p.val??
>>>>>>>
>>>>>>> I don't know of any function pflt. However, if you mean the argument
>>>>>>>
>>>>>> pfilt,
>>>>>> then it *does* filter by adjusted p-value if in fact you are adjusting
>>>>>> the
>>>>>> p-value to begin with.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Jim
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thank you very much!!
>>>>>>
>>>>>>> Hui-Yi
>>>>>>>
>>>>>>> [[alternative HTML version deleted]]
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Bioconductor mailing list
>>>>>>> Bioconductor at stat.math.ethz.ch
>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>>> Search the archives:
>>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>> James W. MacDonald, M.S.
>>>>>> Biostatistician
>>>>>> Hildebrandt Lab
>>>>>> 8220D MSRB III
>>>>>> 1150 W. Medical Center Drive
>>>>>> Ann Arbor MI 48109-0646
>>>>>> 734-936-8662
>>>>>>
>>>>>>
>>>>>>
>>> Thank you with sincere appreciation!!!
>>> Hui-Yi
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>
>> --
>> James W. MacDonald, M.S.
>> Biostatistician
>> Hildebrandt Lab
>> 8220D MSRB III
>> 1150 W. Medical Center Drive
>> Ann Arbor MI 48109-0646
>> 734-936-8662
>>
>
>
[[alternative HTML version deleted]]
_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
More information about the Bioconductor
mailing list