[BioC] limma2annaffy output and heatmap questions
James W. MacDonald
jmacdon at med.umich.edu
Fri Sep 26 22:47:19 CEST 2008
Hui-Yi Chu wrote:
> Hello Jim,
>
>
>>> Thank you for the answer of p.value! and sorry for no codes in mail. The
>>> followings are one example of my data and codes.
>>>
>>> *Example *(partially copied from my result table when coef=1, 1st
>>> probeID):
>>> t-statistic P-value Fold-change mut.a mut.a wt1
>>> wt2
>>> 108.84 0 10.28 9.90436 9.88196 10.2249
>>> 10.1021
>>>
>> This isn't possible. You can't have a fold change of 10.28 for these
>> values. The actual number will be something like -0.2 (The mean of 9.9 and
>> 9.88 minus the mean of 10.22 and 10.1 *cannot* be 10.28).
>>
>
> I completely agree with you! And this is why I don't understand the
> table as well.
> In addition, I've compared the last four columns of this table with
> my original normalized expression values, they are identical, which means
> there might be something with later steps such as contrast? Was something
> wrong with my codes??
>
>
>> *Codes*:
>>> ## data importing
>>> library("affy")
>>> library("limma")
>>> targets <- readTargets("fed_total.txt")
>>> aaa <- ReadAffy(filenames = targets $ filename)
>>> eset <- rma(aaa)
>>> pData(eset)
>>> samples <- data.frame(genotype = c("wt", "wt", "mut.a", "mut.a","mut.b",
>>> "mut.b"),replicate =c(1,2,1,2,1,2))
>>> varInfo <- data.frame(labelDescription=c("wt, mut.a, mut.b", "arb
>>> number"))
>>> pd <- new("AnnotatedDataFrame", data = samples, varMetadata = varInfo)
>>> phenoData(eset) <- new("AnnotatedDataFrame", data = samples, varMetadata =
>>> varInfo)
>>>
>>> ## filtration
>>> library("genefilter")
>>> f1 <- anyNA
>>> f2 <- pOverA(0.25, log2(100))
>>> ff <- filterfun(f1, f2)
>>> selected <- genefilter(eset, ff)
>>> esetsub <- eset[selected,]
>>>
>>> ## limma fit and contrast
>>> library("limma")
>>> design <- model.matrix(~0+factor(c(1,1,2,2,3,3)))
>>> colnames(design) <- c("wt", "mut.a","mut.b")
>>> fit <- lmFit(ef, design)
What is 'ef'? It appears you aren't using the ExpressionSet from above.
>>> fit2 <- eBayes(fit)
>>> contrast.matrix <- makeContrasts("mut.a vs wt" = mut.a-wt,
>>> "mut.b vs wt" = mut.b-wt,
>>> "Diff" = (mut.a-wt)-(mut.b-wt),
>>> levels=design)
>>>
>> You realize that your "Diff" is mut.a - mut.b, yes?
>> Yes, but I think there might be a better way to compare them on
>> the basis of wt.
No. (3-1)-(2-1) is completely identical to 3-2.
>>
>>
>> fit2 <- contrasts.fit (fit, contrast.matrix)
>>> fit3 <- eBayes(fit2)
>>>
>>> ## find DEG and draw heatmap
>>> x <- topTable(fit3, coef=1, adjust="fdr", sort.by="P", number=10000)
>>> y <- x[x$adj.P.Val < 0.01,]
>>> results <- decideTests (fit3, method="separate",
>>> adjust.method="BH",p.value=0.01,lfc=1)
>>>
>>> ## Cluster and heatmaps ------ *draw in three ways, but confused with
>>> results*
>>> ##1. heatmap(as.matrix(esetsub[y$ID,]), scale="none")
>>>
>>> #2. library("Heatplus")
>>> heatmap_2(esetsub[y$ID,]), scale="none")
>>> #3. library(gplots)
>>> heatmap.2(esetsub[y$ID,]), scale= "row", trace="none",
>>> density.info="none")
>>> ## or
>>> heatmap.2(esetsub[y$ID,]), scale= "none", trace="none",
>>> density.info="none")
>>>
>>> ## html file output
>>> library("affycoretools")
>>> library("yeast2.db")
>>> annotation(eset) <- "yeast2.db"
>>> limma2annaffy(eset, fit3, design, adjust = "fdr",
>>> contrast.matrix, annotation(eset), pfilt = 0.01, fldfilt= 1)
>>>
>>>
>>> *Questions*:
>>> I have tons of questions about heatmap, and I sincerely appreciate if
>>> anybody can give me some idea.
>>> 1. what is the scale?? I know there are some discussion in the
>>> maillist, but I am still confused. The result from scale=row is easy to
>>> interpret since we can see some blocks across samples, but how it works??
>>>
>> From ?heatmap.2
>>
>> scale: character indicating if the values should be centered and
>> scaled in either the row direction or the column direction,
>> or none. The default is '"row"' if 'symm' false, and
>> '"none"' otherwise.
>>
>> You might look at ?scale as well.
>>
>> 2. when I change the coef=2, the heatmap result is totally
>>> different, I know the coef=2 is comparing mut.b and wt, so what about the
>>> same group of genes in mut.a?
>>>
>> Not sure what this means. It's a different comparison, so you get different
>> genes. If you want the significant genes from the first comparison, just get
>> the significant genes and extract those from your ExpressionSet.
>> Thank you for the suggestions!
>>
>
>
>>
>> 3. The result of limma contains logFC, AvrExprs, t-statistic, p
>>> value,etc., which value that is used to draw cluster?? logFC or AveExprs??
>>> Some papers their color key display fold change, does that mean I have to
>>> do
>>> something to get the ratio before I draw heatmap?
>>>
>> Again, not sure what you are getting at. You aren't using anything from
>> limma except for the gene names that are being called significant. The data
>> are the expression values from your ExpressionSet.
>>
>>
>>
>>> 4. same question with html table, how does the fold change generate?
>>>
>> It is the difference between the average expression for each group.
>>
>>
>>
>>> Thank you very much!!!!!!!!!!
>>> Hui-Yi
>>>
>>>
>>> *sessionInfo()*
>>> R version 2.7.2 (2008-08-25)
>>> i386-pc-mingw32
>>>
>>> locale:
>>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
>>> States.1252;LC_MONETARY=English_United
>>> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>>
>>> attached base packages:
>>> [1] splines tools stats graphics grDevices utils datasets
>>> [8] methods base
>>>
>>> other attached packages:
>>> [1] yeast2.db_2.2.0 affycoretools_1.12.1 annaffy_1.12.1
>>> [4] KEGG.db_2.2.0 biomaRt_1.14.1 RCurl_0.9-3
>>> [7] GOstats_2.6.0 Category_2.6.0 RBGL_1.16.0
>>> [10] annotate_1.18.0 xtable_1.5-3 GO.db_2.2.0
>>> [13] AnnotationDbi_1.2.2 RSQLite_0.7-0 DBI_0.2-4
>>> [16] graph_1.18.1 affyPLM_1.16.0 gcrma_2.12.1
>>> [19] matchprobes_1.12.1 genefilter_1.20.0 survival_2.34-1
>>> [22] yeast2cdf_2.2.0 limma_2.14.6 affy_1.18.2
>>> [25] preprocessCore_1.2.1 affyio_1.8.1 Biobase_2.0.1
>>>
>>> loaded via a namespace (and not attached):
>>> [1] cluster_1.11.11 XML_1.94-0.1
>>>
>>>
>>>
>>> On Fri, Sep 26, 2008 at 8:32 AM, James W. MacDonald
>>> <jmacdon at med.umich.edu>wrote:
>>>
>>> Hi Hui-Yi,
>>>> Hui-Yi Chu wrote:
>>>>
>>>> Hi everyone,
>>>>> Apologize first for probably simple question below.
>>>>> Thanks for Jim's help, I've got some html files from limma2annaffy
>>>>> function,
>>>>> however, I am not pretty sure how to interpret the result of "fold
>>>>> change".
>>>>> Please correct me if I am wrong. Based on my knowledge, after limma, for
>>>>> example in coef=1, the fold change represents the contrast of the
>>>>> expression
>>>>> value means from two groups. But the html result made me confused:
>>>>>
>>>>> The header of html is like this:
>>>>> probes Description Pubmed Gene Ontology Pathway t-statistic
>>>>> P-value Fold-change array1 array1 array2 array2
>>>>>
>>>>>
>>>>> The question is why the fold change is not really the value that (mean
>>>>> of
>>>>> group1)- (mean of group2) ?? If it is because of permutation??
>>>>>
>>>>> The fold change *is* mean(goup1) - mean(group2). If you show us your
>>>> code
>>>> and a snippet of the output we might be able to help (please see the
>>>> posting
>>>> guide).
>>>>
>>>> I've read limma user guide and ?limma2annaffycore, so I believe the
>>>> value
>>>>
>>>>> of
>>>>> array 1 and 2 are expression value.
>>>>> By the way, the function of pflt filter genes by p value, but if I want
>>>>> to
>>>>> filter gene by adj.p.val??
>>>>>
>>>>> I don't know of any function pflt. However, if you mean the argument
>>>> pfilt,
>>>> then it *does* filter by adjusted p-value if in fact you are adjusting
>>>> the
>>>> p-value to begin with.
>>>>
>>>> Best,
>>>>
>>>> Jim
>>>>
>>>>
>>>>
>>>> Thank you very much!!
>>>>> Hui-Yi
>>>>>
>>>>> [[alternative HTML version deleted]]
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at stat.math.ethz.ch
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives:
>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>
>>>>> --
>>>> James W. MacDonald, M.S.
>>>> Biostatistician
>>>> Hildebrandt Lab
>>>> 8220D MSRB III
>>>> 1150 W. Medical Center Drive
>>>> Ann Arbor MI 48109-0646
>>>> 734-936-8662
>>>>
>>>>
>
> Thank you with sincere appreciation!!!
> Hui-Yi
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
--
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
734-936-8662
More information about the Bioconductor
mailing list