[BioC] model.matrix
Gordon K Smyth
smyth at wehi.EDU.AU
Fri Mar 7 01:12:59 CET 2014
Dear Mike,
> Mike Miller mike.bioc32 at gmail.com
> Thu Mar 6 12:27:01 CET 2014
>
> Dear All,
>
> This question is regarding model.matrix function and the contrasts which
> can be made after applying it. I used this function as a part of edgeR
> package.
>
> Here are 2 designs:
>
> > design_1=model.matrix(~0+ Control+ Gender+ Location, data=data_2)
> > colnames(design_1)
> [1] "Control0" "Control1" "Gender1" "Location1"
>
> How could I get the contrast Gender1-Gender0, shouldn't it be included
> in the columns since there is no intercept?
It is included. It is called "Gender1". By default, model.matrix()
produces contrasts relative to the first level of each factor.
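To see this concretely, here is a minimal sketch on made-up data. The data frame below is a hypothetical stand-in for data_2 (three two-level factors coded 0/1, balanced), and the response y is invented with a known Gender effect of 3, so none of these numbers come from the original data.

```r
# Hypothetical stand-in for data_2: three two-level factors, balanced design.
data_2 <- expand.grid(Control = factor(0:1),
                      Gender = factor(0:1),
                      Location = factor(0:1))

# Invented noise-free response with known effects (assumption for illustration):
# baseline 10, Control effect 2, Gender effect 3, Location effect -1.
y <- with(data_2, 10 + 2 * (Control == "1") + 3 * (Gender == "1") -
                  1 * (Location == "1"))

design_1 <- model.matrix(~ 0 + Control + Gender + Location, data = data_2)
print(colnames(design_1))

# Least-squares coefficients for this design; the "Gender1" entry is the
# Gender1-Gender0 difference built into y.
coef_1 <- lm.fit(design_1, y)$coefficients
print(round(coef_1, 6))
```

On this toy data the "Gender1" coefficient recovers the difference between the two Gender levels, not the mean of Gender level 1.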
> If I want to see the contrast (Gender1-Gender0), I could change the order
> of the factors in the formula:
> > design_2=model.matrix(~0+ Gender+ Control+ Location, data=data_2)
> > colnames(design_2)
> [1] "Gender0" "Gender1" "Control1" "Location1"
>
> But then there is a question: is there any mathematical difference
> between 2 designs?
Yes, there is. Now "Control1" represents Control1-Control0 but "Gender1"
is just Gender1.
I would suggest that you only use "0+" for oneway layouts, not for
additive models with multiple factors.
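The change in meaning can be checked with a small sketch. Again the data frame and the response y are hypothetical (made up for illustration, not your data). On this toy example the two designs reproduce the same fitted values, but the coefficients labelled "Gender1" and "Control1" mean different things in each.

```r
# Hypothetical stand-in for data_2: three two-level factors, balanced design.
data_2 <- expand.grid(Control = factor(0:1),
                      Gender = factor(0:1),
                      Location = factor(0:1))

# Invented noise-free response: baseline 10, Control effect 2,
# Gender effect 3, Location effect -1 (assumptions for illustration).
y <- with(data_2, 10 + 2 * (Control == "1") + 3 * (Gender == "1") -
                  1 * (Location == "1"))

design_1 <- model.matrix(~ 0 + Control + Gender + Location, data = data_2)
design_2 <- model.matrix(~ 0 + Gender + Control + Location, data = data_2)

fit_1 <- lm.fit(design_1, y)
fit_2 <- lm.fit(design_2, y)

# Same fitted values from both parametrizations on this toy data.
same_fit <- isTRUE(all.equal(fit_1$fitted.values, fit_2$fitted.values))
print(same_fit)

# In design_1, "Gender1" is a difference (Gender1-Gender0).
print(round(fit_1$coefficients, 6))
# In design_2, "Gender1" is the baseline mean for Gender level 1,
# while "Control1" is now the difference Control1-Control0.
print(round(fit_2$coefficients, 6))
```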
> If someone knows a link/book where the function model.matrix is well and
> in details explained, please let me know.
The main reference is perhaps Section 11.1 of the Introduction to R manual
that comes with R, but I doubt you will find it fully helpful. You can
also try asking questions on the R-help mailing list.
But really, there are two main things you need to understand to follow
design matrices reasonably well.
First, each factor that you add to the linear model adds one fewer column
than the factor has levels. You start with an intercept. Adding Control
adds one further column (because Control has two levels). Adding Gender
adds one column (because Gender has two levels). Adding Location adds
one column (because Location has two levels). That's four columns in total.
No matter how you parametrize, you must have exactly 4 columns. You can
try fiddling with the model by using "0+"; in that case the first level of
the first factor enters in place of the intercept. But you can't expect the
first level of any other factor to appear because that would make more
than 4 columns.
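The column count can be checked with a short sketch. The data frame is a hypothetical stand-in for data_2 with three two-level factors, matching the column names shown in the question. The last step appends an indicator for the first Gender level by hand (something model.matrix() does not do here) to show that it adds a fifth column without raising the rank.

```r
# Hypothetical stand-in for data_2: three two-level factors, balanced design.
data_2 <- expand.grid(Control = factor(0:1),
                      Gender = factor(0:1),
                      Location = factor(0:1))

# One column for the intercept plus (levels - 1) for each factor.
factor_names <- c("Control", "Gender", "Location")
expected_cols <- 1 + sum(sapply(data_2[factor_names], nlevels) - 1)

with_intercept <- model.matrix(~ Control + Gender + Location, data = data_2)
without_intercept <- model.matrix(~ 0 + Control + Gender + Location,
                                  data = data_2)

print(c(expected = expected_cols,
        with_intercept = ncol(with_intercept),
        without_intercept = ncol(without_intercept)))

# Append a "Gender0" indicator by hand: five columns, but it is a linear
# combination of the existing ones, so the rank does not increase.
extended <- cbind(without_intercept,
                  Gender0 = as.numeric(data_2$Gender == "0"))
rank_extended <- qr(extended)$rank
print(c(columns = ncol(extended), rank = rank_extended))
```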
Second, model.matrix() compares each level back to the first level of each
factor. So simply using ~Control+Gender+Location will give you
coefficients representing Control1-Control0, Gender1-Gender0 and
Location1-Location0. That's not too difficult!
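As a rough illustration of "comparing back to the first level", the sketch below fits the intercept model on made-up data and then uses relevel() to make Gender level 1 the reference instead. The data frame, the response y and the name data_3 are hypothetical; only model.matrix(), relevel() and lm.fit() are standard R.

```r
# Hypothetical stand-in for data_2: three two-level factors, balanced design.
data_2 <- expand.grid(Control = factor(0:1),
                      Gender = factor(0:1),
                      Location = factor(0:1))

# Invented noise-free response: baseline 10, Control effect 2,
# Gender effect 3, Location effect -1 (assumptions for illustration).
y <- with(data_2, 10 + 2 * (Control == "1") + 3 * (Gender == "1") -
                  1 * (Location == "1"))

# Default: every factor is compared back to its first level ("0").
design_default <- model.matrix(~ Control + Gender + Location, data = data_2)
coef_default <- lm.fit(design_default, y)$coefficients
print(round(coef_default, 6))

# Make "1" the first level of Gender; the column becomes "Gender0",
# i.e. Gender0-Gender1, so the sign of the Gender coefficient flips.
data_3 <- transform(data_2, Gender = relevel(Gender, ref = "1"))
design_releveled <- model.matrix(~ Control + Gender + Location, data = data_3)
coef_releveled <- lm.fit(design_releveled, y)$coefficients
print(round(coef_releveled, 6))
```

Changing the reference level only changes which comparison each column represents; the number of columns stays the same.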
Best wishes
Gordon
> Thank you very much in advance!
> Mike