[BioC] limma duplicateCorrelation - unbalanced paired design

Fri Nov 15 18:05:47 CET 2013

Dear all,

I have a question regarding duplicateCorrelation when applied to a paired design.

I am analyzing 8 arrays (Agilent gene expression) from 2 experimental groups, A and B. The samples are paired with one another: A1 with B1, A2 with B2, etc.
Sample B2 had to be dropped because of quality issues, so one sample in group A (A2) is missing its corresponding B sample.

I first ran an unpaired analysis of the data and then, given information I obtained later on, a paired analysis.

targets:

FileName	Group	Pairs
A1.txt	A	1
A2.txt	A	2
A3.txt	A	3
A4.txt	A	4
B1.txt	B	1
B3.txt	B	3
B4.txt	B	4

## Code

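  library(limma)

  # One column per group, no intercept; name the columns after the factor levels
  design <- model.matrix(~0+targets$Group)
  colnames(design) <- levels(factor(targets$Group))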

# UNPAIRED ANALYSIS #

  fit <- lmFit(expr, design=design)
  contrast.matrix <- makeContrasts(contrasts="B-A", levels=design)
  fit <- contrasts.fit(fit, contrast.matrix)
  fit <- eBayes(fit)

  top.unpaired <- topTable(fit, coef="B-A", number=nrow(expr), sort.by="none")

# PAIRED ANALYSIS #

  corfit <- duplicateCorrelation(expr, design, block=targets$Pairs)
  fit <- lmFit(expr, design=design, block=targets$Pairs, correlation=corfit$consensus.correlation)
  contrast.matrix <- makeContrasts(contrasts="B-A", levels=design)
  fit <- contrasts.fit(fit, contrast.matrix)
  fit <- eBayes(fit)

  top.paired <- topTable(fit, coef="B-A", number=nrow(expr), sort.by="none")

## Results

top.unpaired[1,]
     logFC  AveExpr         t    P.Value adj.P.Val         B
0.07913307 10.30572 0.7052201 0.49455794 0.9999397 -4.767979

top.paired[1,]
     logFC  AveExpr        t     P.Value adj.P.Val         B
0.08180912 10.30572 0.779451 0.451223001 0.9996952 -4.769007

What I do not understand is why the two logFC values for the same probe are not identical.
I can easily reproduce the logFC from the unpaired analysis by hand, but not the one from the paired analysis.

I guess the reason is that the paired design is "unbalanced": when I ran the paired and unpaired analyses before removing sample B2 (4 samples per experimental group), the logFC values were identical.
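
To illustrate what I suspect is happening, here is a toy sketch in base R with made-up numbers (not my real data). It assumes that fitting with block and a fixed correlation amounts to generalized least squares with a common within-pair correlation; the values of y and rho and the helper gls_diff are hypothetical and only serve the illustration.

```r
# Toy illustration with invented values; this is not the real expression data.
group <- factor(c("A", "A", "A", "A", "B", "B", "B"))
pairs <- c(1, 2, 3, 4, 1, 3, 4)                      # pair 2 has no B sample
y     <- c(10.1, 10.6, 9.9, 10.3, 10.2, 10.0, 10.5)  # hypothetical log2 values
rho   <- 0.5                                         # assumed within-pair correlation

X <- model.matrix(~ 0 + group)
n <- length(y)

# Correlation matrix: 1 on the diagonal, rho between samples of the same pair
V <- diag(n) + rho * (outer(pairs, pairs, "==") - diag(n))

# B - A estimate from generalized least squares with covariance structure V
# (V = identity gives ordinary least squares, i.e. the unpaired estimate)
gls_diff <- function(X, y, V) {
  Vinv <- solve(V)
  beta <- solve(t(X) %*% Vinv %*% X, t(X) %*% Vinv %*% y)
  unname(beta[2] - beta[1])
}

ols_unbalanced <- gls_diff(X, y, diag(n))
gls_unbalanced <- gls_diff(X, y, V)

# Same comparison after also dropping the unpaired sample A2
keep <- pairs != 2
ols_balanced <- gls_diff(X[keep, ], y[keep], diag(sum(keep)))
gls_balanced <- gls_diff(X[keep, ], y[keep], V[keep, keep])

print(c(ols_unbalanced = ols_unbalanced, gls_unbalanced = gls_unbalanced))
print(c(ols_balanced = ols_balanced, gls_balanced = gls_balanced))
```

In this toy example the two estimates agree once A2 is removed and differ while it is kept, which is the same pattern I see with the real data.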

I would like to understand how this works: is some weighting applied to the samples in this case?
I would also like to make sure that I am not making a mistake by running a paired analysis on this unbalanced paired design: should I also drop sample A2?

Any help and explanation is welcome.

Thanks!

Sarah