[BioC] Nested design in limma
Naomi Altman
naomi at stat.psu.edu
Wed Apr 18 15:36:03 CEST 2007
Dear Caroline,
The key to the response about averaging is that in a purely nested,
completely balanced design like this, with random effects at each
level but the highest, the analysis of each factor of the design
depends only on the averages within the levels below. So, hypotheses
about the differences between groups can be answered using the genewise
averages of all the observations for each patient.
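
As a rough illustration of the averaging argument, here is a minimal
Python sketch for a single gene. It simulates a balanced design of the
shape described in the question (7 patients per group, 2 extractions per
patient, 2 arrays per extraction, probes already averaged) and compares
the two-sample t statistic on the patient averages with the nested ANOVA
F statistic for the group effect computed from all observations. The
variance values, the seed, and every function name are assumptions made
up for the example; this is not limma code.

```python
import random
from statistics import mean

random.seed(1)  # arbitrary seed, only to make the sketch reproducible

N_PATIENTS, N_EXTRACTIONS, N_ARRAYS = 7, 2, 2  # per group / patient / extraction


def simulate_group(group_mean):
    """Return data[patient][extraction][array] for one gene in one group.

    The standard deviations (1.0, 0.5, 0.3) are invented for illustration.
    """
    data = []
    for _ in range(N_PATIENTS):
        patient_effect = random.gauss(0, 1.0)
        patient = []
        for _ in range(N_EXTRACTIONS):
            extraction_effect = random.gauss(0, 0.5)
            patient.append([
                group_mean + patient_effect + extraction_effect + random.gauss(0, 0.3)
                for _ in range(N_ARRAYS)
            ])
        data.append(patient)
    return data


def patient_means(group):
    """Average every observation belonging to each patient."""
    return [mean(v for extraction in patient for v in extraction)
            for patient in group]


groups = [simulate_group(0.0), simulate_group(1.0)]
pm = [patient_means(g) for g in groups]   # 7 averages per group
gm = [mean(m) for m in pm]                # group means

# Two-sample pooled-variance t statistic on the patient averages.
ss_within = sum((m - gm[i]) ** 2 for i in range(2) for m in pm[i])
df = 2 * (N_PATIENTS - 1)
pooled_var = ss_within / df
t_stat = (gm[0] - gm[1]) / (pooled_var * 2 / N_PATIENTS) ** 0.5

# Nested ANOVA F statistic for the group effect, using all observations:
# MS(group) divided by MS(patient within group).
obs_per_patient = N_EXTRACTIONS * N_ARRAYS
grand = mean(gm)  # balanced design, so the grand mean is the mean of group means
ms_group = N_PATIENTS * obs_per_patient * sum((g - grand) ** 2 for g in gm) / 1
ms_patient = obs_per_patient * ss_within / df
f_stat = ms_group / ms_patient

print(f"t^2 from patient averages: {t_stat ** 2:.6f}")
print(f"nested ANOVA F for group:  {f_stat:.6f}")
```

In a balanced design the two numbers agree up to floating-point error,
which is the sense in which the extraction and array levels add nothing
to the test of the group difference once the patient averages are known.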
The levels of subsampling can be used to determine the main sources
of variation in the study, which is useful for planning further
studies, but not for testing differences between groups. If you need
to understand the sources of variance in your study, you could handle
this in limma by analyzing each group separately, level by
level. Alternatively, you could use SAS to estimate the variance
components for each level of replication. I think that MAANOVA in
Bioconductor may also do this analysis, but I have not used it.
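
To make the idea of variance components concrete, the following Python
sketch computes method-of-moments (ANOVA) estimates for one gene within
one group of a balanced patient / extraction / array design. It is only
an illustration under those assumptions: the function name and the toy
data are made up, and it is not what limma, SAS, or MAANOVA do
internally.

```python
from statistics import mean


def variance_components(data):
    """Method-of-moments variance components for one gene in one group.

    data[patient][extraction][array] must be balanced. Returns raw
    estimates; a negative estimate is usually reported as zero.
    """
    p, e, a = len(data), len(data[0]), len(data[0][0])
    ext_means = [[mean(ext) for ext in patient] for patient in data]
    pat_means = [mean(m) for m in ext_means]
    grand = mean(pat_means)

    # Mean square among arrays within extractions: estimates var(array).
    ms_array = sum(
        (v - ext_means[i][j]) ** 2
        for i, patient in enumerate(data)
        for j, ext in enumerate(patient)
        for v in ext
    ) / (p * e * (a - 1))

    # Mean square among extractions within patients:
    # estimates var(array) + a * var(extraction).
    ms_ext = a * sum(
        (ext_means[i][j] - pat_means[i]) ** 2
        for i in range(p) for j in range(e)
    ) / (p * (e - 1))

    # Mean square among patients:
    # estimates var(array) + a * var(extraction) + a * e * var(patient).
    ms_pat = a * e * sum((m - grand) ** 2 for m in pat_means) / (p - 1)

    return {
        "array": ms_array,
        "extraction": (ms_ext - ms_array) / a,
        "patient": (ms_pat - ms_ext) / (a * e),
    }


# Toy data: 2 patients x 2 extractions x 2 arrays (invented numbers).
example = [[[1, 3], [5, 7]], [[2, 4], [6, 8]]]
components = variance_components(example)
print(components)
```

Comparing the three estimates shows which level of replication
contributes most of the variation, which is the information that helps
when planning further studies.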
--Naomi
At 10:47 PM 4/17/2007, Kasper Daniel Hansen wrote:
>On Apr 17, 2007, at 2:23 AM, <caroline.truntzer at chu-lyon.fr>
><caroline.truntzer at chu-lyon.fr> wrote:
>
> > Dear list,
> > My question is a follow-up to the thread about handling nested
> > designs using limma posted by Tao Shi (please see
> > https://stat.ethz.ch/pipermail/bioconductor/2007-January/015717.html).
> > I have a data set with a design similar to Tao Shi's: 14 patients
> > (7 in one group, 7 in another group), 2 biological samples for each
> > patient (corresponding to 2 different extractions), each extraction
> > hybridized to 2 arrays, and triplicate sets of probes. I would like
> > to identify genes that are differentially expressed between the 2
> > groups.
> > I read the responses written to Tao on how to analyse this data set,
> > but there are some things I didn't understand.
> > The advice was to use avedups() to average over the triplicate
> > probes, and then to treat the patients as biological replicates (as
> > blocks using duplicateCorrelation). But by doing so I do not
> > understand how the two other replication levels, that is extraction
> > and hybridization, are treated.
> > Is it possible to keep the information from these two replication
> > levels in the analysis? Is it possible to set different levels in
> > blocks (given the help for the duplicateCorrelation function I think
> > it is not possible, but perhaps someone has found a way to do that)?
> > Moreover, I think I'm confused about what should be put in the
> > design matrix and what should rather be put in the blocks vector.
> > I'm sorry for this naive question...
>
> > Thanks in advance for your help
> > Caroline
>
>This will be a quick answer. You are right that you have many levels
>of dependency in your design: 3 probes measuring the same transcript,
>2 samples per patient and 2 hybridizations per sample. That should
>(from a certain perspective) be analyzed using a model with several
>random effects (i.e. several levels of dupCor). Unfortunately limma
>cannot handle more than one level, so in that case you need to focus
>on the dependency you think is most important to model. The
>recommendations in the thread you are referring to (which I only
>skimmed _very_ quickly) essentially deal with this question.
>
>Kasper
Naomi S. Altman 814-865-3791 (voice)
Associate Professor
Dept. of Statistics 814-863-7114 (fax)
Penn State University 814-865-1348 (Statistics)
University Park, PA 16802-2111