[BioC] Limma analysis of focused arrays vs. whole genome arrays
Mike Schaffer
mschaff at bu.edu
Tue Jun 7 15:33:51 CEST 2005
Hi,
The lab I work with has used "whole genome" human arrays (~18,000
genes) for a couple of years, and I have helped with the analysis using
Limma. Due to costs, they are now considering switching from
whole genome arrays to focused arrays with ~400 genes of interest
(selected from the whole-genome array results).
The obvious analysis problems with a focused array where most genes are
changing are:
1. LOESS normalization assumes most genes are not changing. If most of
the genes are expected to change, there is no basis to recenter the
data around zero. The response from the lab was that they would be
willing to include 100-150 genes that are not expected to change.
2. The B-statistic in Limma requires a parameter specifying the expected
fraction of genes that are changing. The corresponding moderated
t-statistic uses the data from all genes to moderate the standard error
in the t calculation. Both of these could change dramatically if most
of the genes on the array are changing.
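
To make the two concerns concrete, here is a rough sketch in plain Python. It is not Limma code. The function names and the example numbers are made up for illustration. The first function recenters log-ratios using only the control genes (a simple median shift, which ignores the intensity dependence that LOESS would model). The second shows the usual shrinkage form of a moderated variance, where the prior variance and prior degrees of freedom are passed in as assumed values rather than estimated from all genes.

```python
from statistics import median


def center_on_controls(m_values, is_control):
    """Shift log-ratios so the control genes have a median of zero.

    Hypothetical helper: only the genes flagged as controls are used
    to estimate the offset, so changing genes do not bias it.
    """
    controls = [m for m, c in zip(m_values, is_control) if c]
    if not controls:
        raise ValueError("need at least one control gene")
    offset = median(controls)
    return [m - offset for m in m_values]


def moderated_variance(s2, d, s0_2, d0):
    """Shrink a per-gene variance s2 (d df) toward a prior s0_2 (d0 df).

    s0_2 and d0 are assumed inputs here; in an empirical Bayes analysis
    they would be estimated from the variances of all genes on the array.
    """
    return (d0 * s0_2 + d * s2) / (d0 + d)


# Made-up log-ratios: the first three genes are controls, the rest change.
m = [0.5, 0.25, 0.75, 2.5, 3.0, -1.5]
ctrl = [True, True, True, False, False, False]
centered = center_on_controls(m, ctrl)

# A gene variance of 1.0 on 4 df, pulled toward a prior of 0.5 on 4 df.
shrunk = moderated_variance(1.0, 4, 0.5, 4)
```

The point of the second function is that the prior terms come from the whole array, so if most of the ~400 genes are changing, the values fed in as s0_2 and d0 could look quite different from those obtained on a whole-genome array.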
My questions are:
1. Are my concerns valid, and are there ways around them? Are
there other analysis pitfalls with this scenario?
2. Can Limma handle situations where most of an array is expected to
change? What modifications, if any, need to be made to the Limma
analysis to account for this?
3. Alternatively, is there a more appropriate statistical package to
use in this case?
Thanks.
--
Mike