[BioC] edgeR: design matrix for different condition

KJ Lim jinkeanlim at gmail.com
Mon Jul 30 13:01:07 CEST 2012


Dear Prof Gordon,

Good day.

I have read the section 8.5 of the Limma manual as you suggested in
previous emails. Thanks.

If I understand you correctly, you would suggest me to carry out my DE
analysis with limma package if I would like to learn the which genes
are express in treeHS compare to treeLS (vice versa) at i.e. 24H.

May I ask how can I generate the EList, "eset", in order to fit in the
linear model as mentioned in the limma manual

  fit <- lmFit(eset, design)
  fit <- eBayes(fit)

Please correct me if I'm wrong.

Thank you very much for your time and help.

Best regards,
KJ Lim



On 27 July 2012 02:31, Gordon K Smyth <smyth at wehi.edu.au> wrote:
> Dear KJ Lim,
>
> Thanks for the rephrasing, which is clearer.  I would have liked you to
> mention however whether you read the documentation that I refered you do.
>
>
> On Thu, 26 Jul 2012, KJ Lim wrote:
>
>> Dear Prof Gordon,
>>
>> Good day. Thanks for your prompt replied.
>>
>> Please allow me to rephrase my previous question.
>>
>> This model,  ~tree+treat,  is assumed effect of the time of treatment
>> will be the same irrespective of genotype.
>
>
> Yes, this model makes that assumption.
>
>
>> Thus, test for the coef=3 will gives me the differential expression at 96H
>> irrespective of genotype.
>
>
> Actually coef=3 refers to 24H, according to the design matrix in your
> original email.  A test for coef=3 would test for DE at 24H vs 0H,
> irrespective of genotype.  But only if the assumption mentioned above is
> true, which is unlikely given the rest of your email.
>
>
>> I would like to learn the differential expression of treeHS vs treeLS
>> at specific time points i.e. 24H or if possible across the whole time
>> points. Should I fit my model as
>>
>>  model A: ~tree*treat    OR   model B: ~tree+tree:treat ?
>
>
> These models are equivalent, so it is just a matter of convenience which one
> you use, as I tried to explain in the limma documentation I refered you to.
>
>
>> The design matrix columns of model B give:
>>  "(Intercept)"  "treeLS"  "treeHS:treat03H"  "treeLS:treat03H"
>> "treeHS:treat24H"  "treeLS:treat24H"  "treeHS:treat96H"
>> "treeLS:treat96H"
>>
>> Am I doing right if I fit the model B and test for the coef=3:4 to
>> learn the differential expression of treeHS vs treeLS at specific time
>> points i.e. 03H? Does this test could tells me which set of genes
>> up/down-regulated in treeLS or treeHS?
>
>
> Yes.  Coef 3 tests for a 3H effect in HS.  Coef 4 tests for a 3H in LS. So
> testing for both coefficients coef=3:4 tests for a 3H effect in either
> treeLS or treeHS.
>
> Best wishes
> Gordon
>
>
>> I'm not good in statistic and I'm learning,  kindly please correct me
>> if I'm wrong.
>>
>> Thank you very much for your time and suggestion.
>>
>> Best regards,
>> KJ Lim
>>
>>
>>
>>
>> On 26 July 2012 03:39, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>>>
>>>
>>> Dear KJ Lim,
>>>
>>> I don't quite understand your question, because you seem to be asking for
>>> something that isn't a test for differential expression, which is what edgeR
>>> does.  You question "I would like to learn the genes are express in treeHS
>>> but not treeLS and vice versa" seems to be about absolute expression levels.
>>> edgeR doesn't test for genes that are not expressed in particular
>>> conditions.
>>>
>>> I'll refer you to the limma section on interaction models in case that
>>> helps, see Section 8.5 of:
>>>
>>>
>>> http://bioconductor.org/packages/2.11/bioc/vignettes/limma/inst/doc/usersguide.pdf
>>>
>>> Setting up a design matrix is the same for edgeR as it is for limma.
>>>
>>> Best wishes
>>> Gordon
>>>
>>>> Date: Tue, 24 Jul 2012 17:12:00 +0300
>>>> From: KJ Lim <jinkeanlim at gmail.com>
>>>> To: Bioconductor mailing list <bioconductor at r-project.org>
>>>> Subject: [BioC] edgeR: design matrix for different condition
>>>>
>>>> Dear the edgeR community,
>>>>
>>>> Good day.
>>>>
>>>> I'm analyzing my RNA-Seq experiment with edgeR. My study has 2
>>>> different genotypes (treeHS, treeLS) and time of treatment (0H, 3H,
>>>> 24H, 96H).
>>>>
>>>> I first assumed that the treatment effect will be the same for each
>>>> genotype and I have the design matrix as:
>>>>
>>>>  design <- model.matrix(~tree+treat)
>>>>
>>>>> design
>>>>
>>>>
>>>>   (Intercept)  treat03H  treat24H  treat96H  treeLS
>>>> 1             1        0        0        0      0
>>>> 2             1        0        0        0      0
>>>> 3             1        1        0        0      0
>>>> 4             1        1        0        0      0
>>>> 5             1        0        1        0      0
>>>> 6             1        0        1        0      0
>>>> 7             1        0        0        1      0
>>>> 8             1        0        0        1      0
>>>> 9             1        0        0        0      1
>>>> 10           1        0        0        0      1
>>>> 11           1        1        0        0      1
>>>> 12           1        1        0        0      1
>>>> 13           1        0        1        0      1
>>>> 14           1        0        1        0      1
>>>> 15           1        0        0        1      1
>>>> 16           1        0        0        1      1
>>>>
>>>> I used coef=4 to test for the differential expressions between treeHS
>>>> and treeLS within the time of treatment, coef=3 to learn the
>>>> differential expressions in 2 genotypes at time of treatment 96H.
>>>>
>>>> ** I would like to learn the genes are express in treeHS but not
>>>> treeLS and vice versa at certain time of treatment let's say 24H or
>>>> across the whole time of treatment, should I have the design matrix as
>>>> below or more steps I need to do?
>>>>
>>>>  design <- model.matrix(~tree*treat)
>>>>
>>>> Kindly please light me on this. I appreciate very much for your help and
>>>> time.
>>>>
>>>> Have a nice day.
>>>>
>>>> Best regards,
>>>> KJ Lim
>
>
> ______________________________________________________________________
> The information in this email is confidential and inte...{{dropped:6}}



More information about the Bioconductor mailing list