[BioC] limma question: direct two-color design & modeling individual subject effects

Paul Shannon pshannon at systemsbiology.org
Mon Apr 30 05:08:15 CEST 2007


I've been working on and off for a few months with limma on a set of 28 2-color
arrays made up of 14 dye-swap pairs.  The main contrast in the arrays is between
malaria parasite RNA extracted from maternal and from juvenile hosts;
all the arrays can be described in these terms.  This is the main effect we
are studying, and limma is very helpful in elucidating it.

The arrays can be more specifically described as comparisons between specific
maternal subjects and specific juvenile subjects -- between different
combinations of three mothers (m918, m836, m920) with seven children (c073, c135,
c140, c372, c451, c413, c425).  I have trouble fitting models to some
genes: the model fails to isolate the effects of individual subjects where those
effects seem to be strong.

(A good example can be seen at http://gaggle.systemsbiology.net/pshannon/tmp/7346.png,
where the effect of m920 is pronounced, but apparently missed by my lmFit/eBayes model.)

Here are the first few lines of each of the matrices I use that lead to that plot.

---- head (targets)

  SlideNumber      Name            FileName      Cy3      Cy5 Mother Child
1        2254 slide2254 m918c073-cy3cy5.gpr maternal juvenile   m918  c073
2        2261 slide2261 m918c073-cy5cy3.gpr juvenile maternal   m918  c073
3        2258 slide2258 m836c073-cy3cy5.gpr maternal juvenile   m836  c073
4        2265 slide2265 m836c073-cy5cy3.gpr juvenile maternal   m836  c073
5        2341 slide2341 m836c135-cy3cy5.gpr maternal juvenile   m836  c135
6        2344 slide2344 m836c135-cy5cy3.gpr juvenile maternal   m836  c135

----- head (design)

  mother child maternal
1   m918  c073      Low
2   m918  c073     High
3   m836  c073      Low
4   m836  c073     High
5   m836  c135      Low
6   m836  c135     High

---- create the model

model <- model.matrix (~maternal + mother + child, design)

head (model)
  (Intercept) maternalHigh motherm918 motherm920 childc135 childc140 childc372 childc413 childc425 childc451
1           1            0          1          0         0         0         0         0         0         0
2           1            1          1          0         0         0         0         0         0         0
3           1            0          0          0         0         0         0         0         0         0
4           1            1          0          0         0         0         0         0         0         0
5           1            0          0          0         1         0         0         0         0         0
6           1            1          0          0         1         0         0         0         0         0
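
For anyone who wants to reproduce the head of that matrix without my data, here is a
minimal sketch in base R.  It only rebuilds the six rows shown above.  The explicit
factor levels are an assumption inferred from the column names (m836, c073 and "Low"
appear to be the reference levels, since they have no columns of their own); the
real design has 28 rows.

```r
# First six rows of the design table, as printed by head (design).
# Level order is an assumption inferred from the model.matrix column names:
# the first level of each factor is the reference absorbed by the intercept.
design6 <- data.frame(
  mother   = factor(c("m918", "m918", "m836", "m836", "m836", "m836"),
                    levels = c("m836", "m918", "m920")),
  child    = factor(c("c073", "c073", "c073", "c073", "c135", "c135"),
                    levels = c("c073", "c135", "c140", "c372",
                               "c413", "c425", "c451")),
  maternal = factor(rep(c("Low", "High"), times = 3),
                    levels = c("Low", "High"))
)

# Treatment-contrast coding: one intercept, plus one indicator column
# for every non-reference level of each factor.
model6 <- model.matrix(~ maternal + mother + child, design6)

colnames(model6)
model6[, c("(Intercept)", "maternalHigh", "motherm918", "childc135")]
```

Read this way, motherm920 is a shift relative to m836, not relative to the
average mother, and the same holds for every child column relative to c073.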

---- fit the data

fit <- lmFit (MA, model)
efit <- eBayes (fit)

# one example of poor fit.  with probe 7346, the m920 effect is very strong, but the coefficients
# don't reflect that.  instead, most of the influence is allocated to the maternal effect, which 
# nicely models all the comparisons except those involving m920.  the fit there is strikingly
# poor, with high residuals. I can't make sense of the tiny motherm920 coefficient:

> efit$coef [7346,]
 (Intercept) maternalHigh   motherm918   motherm920    childc135    childc140    childc372    childc413    childc425    childc451
 -3.62867124   7.49268173   0.24858455  -0.02635289  -0.67898282  -0.24566235  -0.24673763   0.10618603  -0.37520911  -0.02761610
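
To make concrete what these coefficients imply, here is a small sketch in base R that
multiplies the six design rows shown earlier by the coefficients printed above.  It is
only arithmetic on the numbers in this message, not a rerun of lmFit.  The m920 row
at the end is hypothetical: I pair m920 with c073 purely for illustration.

```r
# Coefficients for probe 7346, copied from the output above.
coefs <- c(-3.62867124, 7.49268173, 0.24858455, -0.02635289, -0.67898282,
           -0.24566235, -0.24673763, 0.10618603, -0.37520911, -0.02761610)
names(coefs) <- c("(Intercept)", "maternalHigh", "motherm918", "motherm920",
                  "childc135", "childc140", "childc372", "childc413",
                  "childc425", "childc451")

# The six design rows printed by head (model), rebuilt by hand.
rows <- matrix(0, nrow = 6, ncol = length(coefs),
               dimnames = list(as.character(1:6), names(coefs)))
rows[, "(Intercept)"]            <- 1
rows[c(2, 4, 6), "maternalHigh"] <- 1
rows[1:2, "motherm918"]          <- 1
rows[5:6, "childc135"]           <- 1

# Fitted value for each array = design row times coefficient vector.
fitted6 <- drop(rows %*% coefs)
fitted6

# Hypothetical array (pairing assumed for illustration only):
# same as row 3 (m836, c073, Low) but with mother m920.
m920_row <- rows["3", ]
m920_row["motherm920"] <- 1
m920_fit <- sum(m920_row * coefs)

# Difference from the matching m836 array is just the motherm920 coefficient.
m920_fit - fitted6[["3"]]
```

So under this model an m920 array is predicted to sit within about 0.03 of the
corresponding m836 array, which is why the m920 comparisons end up with large
residuals.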

The plot of the fitted & actual values can be found at 

      http://gaggle.systemsbiology.net/pshannon/tmp/7346.png

I may be over-interpreting, or mis-interpreting, or even misrepresenting all this.  But after lots
of head scratching, lots of reading and experiments, I can't get the coefficients to do what I think
they should.  Perhaps it's my failure to use a contrast matrix.  Or something else.

Any suggestions?  I'll be really grateful for any advice.

Thanks!

 - Paul


