[BioC] edgeR: generating a correct design matrix - multifactorial design

Natasha Sahgal nsahgal at well.ox.ac.uk
Thu Jul 26 14:29:08 CEST 2012


Dear Prof. Gordon and List,

I have an RNA-Seq experiment with a multifactorial design, which I'd like to analyse with edgeR.
Having gone through the user guide, I am still unsure how to set up the model for this experiment.

The experiment: 2 cell lines (mut, wt), 2 conditions (stimulated, unstimulated), n=2 in each group.
My aim: to detect DE genes based on the effect of stimulus on mut cells.

Thus,
dat
  Sample Group Stim
1      1    WT   No
2      2    WT   No
3      3   WT+  Yes
4      4   WT+  Yes
5      5   Mut   No
6      6   Mut   No
7      7  Mut+  Yes
8      8  Mut+  Yes

Now, if this were array data the model would be:
design = model.matrix(~dat$Group)
and whilst fitting the model I could make a contrast such as (Mut+ - Mut) - (WT+ - WT)
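
To make the contrast above concrete, here is a minimal sketch in base R only. It assumes a group-means design (no intercept, one column per group) rather than the intercept form written above, and the renamed column labels are illustrative. edgeR's glmLRT has a `contrast` argument, so a vector like this could presumably be passed to it after fitting, but that step is not shown or tested here.

```r
# Rebuild the sample table from the post (values copied from `dat` above).
dat <- data.frame(
  Sample = 1:8,
  Group  = factor(rep(c("WT", "WT+", "Mut", "Mut+"), each = 2),
                  levels = c("WT", "WT+", "Mut", "Mut+")),
  Stim   = factor(rep(c("No", "Yes", "No", "Yes"), each = 2))
)

# Group-means parametrization: no intercept, one column per group.
design <- model.matrix(~ 0 + Group, data = dat)
colnames(design) <- c("WT", "WTplus", "Mut", "Mutplus")  # illustrative names

# (Mut+ - Mut) - (WT+ - WT), as weights in column order WT, WT+, Mut, Mut+.
contrast <- c(WT = 1, WTplus = -1, Mut = -1, Mutplus = 1)

# Per-sample weight implied by the contrast (sanity check only).
weights <- as.vector(design %*% contrast)
```

The weights sum to zero, as expected for a difference of differences.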

I am not sure how to do this for the RNA-Seq data (i.e. what should the model be? And what coefficients should I pull out?)

Should the model be:
1) model.matrix(~dat$Group), with the above contrast then specified in some way in the glmLRT function?

2) model.matrix(~dat$Group+dat$Group*dat$Stim) (coefficient/contrast?)

3) model.matrix(~dat$Group*dat$Stim) (coefficient/contrast?)
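
For comparison, the options above can be inspected in base R. This is a sketch under assumptions: `Line` is a hypothetical two-level cell-line factor derived from Group by stripping the "+" suffix, and it is not part of the original table. Because the four-level Group already encodes Stim, the Group*Stim matrix has redundant columns, whereas Line*Stim gives four columns in which the interaction term corresponds to (Mut+ - Mut) - (WT+ - WT) under R's default treatment coding.

```r
# Sample table as in the post, plus a hypothetical cell-line factor.
dat <- data.frame(
  Group = factor(rep(c("WT", "WT+", "Mut", "Mut+"), each = 2),
                 levels = c("WT", "WT+", "Mut", "Mut+")),
  Stim  = factor(rep(c("No", "Yes", "No", "Yes"), each = 2))
)
# Assumption: cell line is Group with the "+" suffix removed.
dat$Line <- factor(sub("+", "", as.character(dat$Group), fixed = TRUE),
                   levels = c("WT", "Mut"))

# Option 3 as written: Group already determines Stim, so columns are redundant.
d3 <- model.matrix(~ Group * Stim, data = dat)
rank3 <- qr(d3)$rank   # compare with ncol(d3)

# Two-factor version: (Intercept), LineMut, StimYes, LineMut:StimYes.
d2 <- model.matrix(~ Line * Stim, data = dat)
rank2 <- qr(d2)$rank   # compare with ncol(d2)
```

Comparing `rank3` with `ncol(d3)` and `rank2` with `ncol(d2)` shows which of the two designs is full rank.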


I'd appreciate any help and advice.


Many Thanks,
Natasha


