[BioC] [Bioc-devel] help with limma design
James W. MacDonald
jmacdon at med.umich.edu
Mon Jun 22 16:50:32 CEST 2009
Hi Kaiyu,
First off, this isn't an appropriate question for Bioc-devel. That list
is intended for questions about developing Bioconductor packages, not
questions about how to use the packages. I have re-directed to the
correct list.
Kaiyu Shen wrote:
> Hello, folks:
> I am now using the limma package to analyze two-color arrays. Here are
> the six arrays that I have:
>
> # Cy3 Cy5
> Array1 MU1 WT
> Array2 WT MU1
> Array3 MU2 WT
> Array4 WT MU2
> Array5 MU3 WT
> Array6 WT MU3
>
> What I want to do is compare MU1 vs WT.
> I tried two approaches. To keep things simple, I have not applied any
> pre-processing methods:
>
> A. Use just the first two arrays for the analysis
>
> # Cy3 Cy5
> Array1 MU1 WT
> Array2 WT MU1
>
> object=readTargets("limma.txt")
> RG=read.maimages(object,source="agilent")
> MA=normalizeWithinArrays(RG)
> design=c(1,-1)
> fit=lmFit(MA,design)
> fit=eBayes(fit)
> topTable(fit)
>
>
> B. Include all six arrays so that the other comparisons are analyzed
> simultaneously
>
> # Cy3 Cy5
> Array1 MU1 WT
> Array2 WT MU1
> Array3 MU2 WT
> Array4 WT MU2
> Array5 MU3 WT
> Array6 WT MU3
>
> object=readTargets("limma.txt")
> RG=read.maimages(object,source="agilent")
> MA=normalizeWithinArrays(RG)
> design=cbind(mu1=c(1,-1,0,0,0,0),mu2=c(0,0,1,-1,0,0),mu3=c(0,0,0,0,1,-1))
> cont.matrix=makeContrasts(mu1,mu2,mu3,levels=design)
> fit=lmFit(MA,design)
> fit2=contrasts.fit(fit,cont.matrix)
> fit2=eBayes(fit2)
> topTable(fit2,coef=1) #to get the first comparison (array1 vs array2)
>
>
> However, these two methods do not give me the same results.
> Could somebody give me some advice on these two methods?
The differences arise primarily because you are fitting a linear model
here, so the denominator of your t-statistic is an estimate of the
residual variability left over after fitting the coefficients you have
defined. In the first case that estimate comes from only two arrays
(one dye-swap pair, one coefficient), whereas in the second case it
comes from all six arrays (three dye-swap pairs, three coefficients).
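To make that concrete, here is a small sketch in base R (no limma
needed) using the two designs from your message. The helper name
resid_df is just something I made up for illustration. It shows that
the residual degrees of freedom differ between the two designs, while
the unscaled standard error of the mu1 coefficient does not:

```r
# Design A: one dye-swap pair, a single coefficient.
design_a <- matrix(c(1, -1), ncol = 1, dimnames = list(NULL, "mu1"))

# Design B: three dye-swap pairs, one coefficient per mutant
# (same matrix as in the question).
design_b <- cbind(mu1 = c(1, -1, 0, 0, 0, 0),
                  mu2 = c(0, 0, 1, -1, 0, 0),
                  mu3 = c(0, 0, 0, 0, 1, -1))

# Residual degrees of freedom = number of arrays - rank of the design.
# (resid_df is a hypothetical helper, not a limma function.)
resid_df <- function(design) nrow(design) - qr(design)$rank

df_a <- resid_df(design_a)  # 2 arrays - 1 coefficient
df_b <- resid_df(design_b)  # 6 arrays - 3 coefficients

# Unscaled standard error of the mu1 coefficient under each design.
se_a <- sqrt(solve(crossprod(design_a))["mu1", "mu1"])
se_b <- sqrt(solve(crossprod(design_b))["mu1", "mu1"])

print(c(df_A = df_a, df_B = df_b, unscaled_se_A = se_a, unscaled_se_B = se_b))
```

Because the columns of the six-array design do not overlap, the mu1
estimate itself uses only arrays 1 and 2 in both analyses, so the log
fold changes should agree. What changes is the variance estimate that
goes into the t-statistic.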
How this affects your results depends on the data. In the second case
you have increased the amount of data used to compute the sum of
squared errors (SSE), so the residual variance is estimated on three
degrees of freedom instead of one. That usually gives a more stable
estimate and more power, and might result in more genes being
significant. However, if the variability within the MU2 and MU3 arrays
is much higher than within the MU1 arrays, then this will tend to
inflate the pooled variance estimate (larger denominator => smaller
t-statistic), and you will get fewer genes.
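Here is a toy illustration for a single gene, again in base R. The
log-ratios are invented numbers, t_mu1 is a hypothetical helper, and
these are ordinary least-squares t-statistics rather than the moderated
ones eBayes() gives you, so treat it as a sketch of the mechanism only:

```r
design_a <- matrix(c(1, -1), ncol = 1, dimnames = list(NULL, "mu1"))
design_b <- cbind(mu1 = c(1, -1, 0, 0, 0, 0),
                  mu2 = c(0, 0, 1, -1, 0, 0),
                  mu3 = c(0, 0, 0, 0, 1, -1))

# Ordinary t-statistic for the first coefficient (mu1) of one gene.
# (t_mu1 is a hypothetical helper, not part of limma.)
t_mu1 <- function(m, design) {
  fit <- lm.fit(design, m)
  s2  <- sum(fit$residuals^2) / fit$df.residual    # SSE / residual df
  se  <- sqrt(s2 * solve(crossprod(design))[1, 1]) # standard error of mu1
  est <- unname(fit$coefficients[1])
  c(coef = est, se = se, t = est / se, df = fit$df.residual)
}

# Invented log-ratios for one gene on arrays 1 and 2 (the MU1 pair).
m_pair  <- c(1.2, -0.8)
# Same MU1 pair plus well-behaved MU2 and MU3 arrays.
m_quiet <- c(m_pair, 0.3, -0.2, -0.1, 0.2)
# Same MU1 pair plus much noisier MU2 and MU3 arrays.
m_noisy <- c(m_pair, 1.5, 0.5, -1.0, -0.6)

res_a     <- t_mu1(m_pair,  design_a)
res_quiet <- t_mu1(m_quiet, design_b)
res_noisy <- t_mu1(m_noisy, design_b)

print(rbind(two_arrays = res_a, six_quiet = res_quiet, six_noisy = res_noisy))
```

The mu1 estimate is the same in all three rows. The t-statistic goes up
when the extra arrays are quiet and down when they are noisy, because
they all feed into the same pooled variance estimate.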
Best,
Jim
>
> Thank you very much
>
> _______________________________________________
> Bioc-devel at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826