[BioC] design matrix

Wed May 23 13:13:26 CEST 2007

At 08:15 PM 23/05/2007, Lev Soinov wrote:
>Dear Gordon,
>
>When using a script that I already asked about (below), I noticed 
>the following.
>
>My Data matrix is for 30 samples, the first column represents probe 
>IDs. I use for the unordered dataset (samples are not ordered in the 
>Data file according to the treatments) the following script:
> > design <- model.matrix(~0 
> +factor(c(3,2,4,5,1,4,3,6,6,1,2,1,3,5,4,2,5,6,3,4,5,6,1,2,3,4,5,6,1,2)))
> > colnames(design) <- c("Untreated","L1","L2","L1A","L2A","A")
> > contrast.matrix <- makeContrasts(L1-Untreated, L2-Untreated, 
> L1A-Untreated, L2A-Untreated, A-Untreated, L1A-A, L2A-A, L1A-L1, 
> L2A-L2, levels=design)
>
>For the ordered data it should produce the same results in the 
>topTable as the script:
> > design <- model.matrix(~0 
> +factor(c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5,5,5,6,6,6,6,6,6)))
> > colnames(design) <- c("Untreated","L1","L2","L1A","L2A","A")
> > contrast.matrix <- makeContrasts(L1-Untreated, L2-Untreated, 
> L1A-Untreated, L2A-Untreated, A-Untreated, L1A-A, L2A-A, L1A-L1, 
> L2A-L2, levels=design)
>
>However there are some differences, especially in the AveExpr column:
>
>1. unordered:
>           ID     logFC  AveExpr         t      P.Value    adj.P.Val        B
>168840 -5.302214 12.51356 -27.83821 3.590400e-09 2.717735e-05 11.23314
>101140 -5.175045 12.51114 -27.67796 3.756415e-09 2.717735e-05 11.20135
>123375 -4.265825 14.04540 -27.65702 3.778740e-09 2.717735e-05 11.19717
>194271 -4.237919 16.31592 -27.05930 4.483564e-09 2.717735e-05 11.07557
>142723 -5.529549 12.89764 -26.63683 5.071066e-09 2.717735e-05 10.98684
>2. ordered:
>           ID     logFC  AveExpr         t      P.Value    adj.P.Val        B
>168840 -5.302013 10.23113 -27.83760 3.589533e-09 2.716231e-05 11.23349
>101140 -5.174833 10.22872 -27.68107 3.751573e-09 2.716231e-05 11.20243
>123375 -4.265788 11.76289 -27.65844 3.775669e-09 2.716231e-05 11.19792
>194271 -4.237924 14.03339 -27.06015 4.480631e-09 2.716231e-05 11.07621
>142723 -5.529383 10.61520 -26.63693 5.068864e-09 2.716231e-05 10.98732
>
>Could you please comment on this? Is it essential to order samples 
>in the data file according to treatments before running lmFit? I 
>thought that this should not matter as long as one gets the actual 
>order in the design matrix right.

Sample order makes no different to limma. If your results are 
different, then you've made a mistake.

Best wishes
Gordon

>With kind regards,
>Lev.
>
>My script:
>s<-scan("Data.txt",what='character')
>sm<-matrix(s,byrow=TRUE,ncol=31)
>rownames(sm)<-sm[,1]
>sm<-sm[,2:ncol(sm)]
>snn<-apply(sm,2,as.numeric)
>rownames(snn)<-rownames(sm)
>signals<-snn
><?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" />
>temp<-vsn(signals)
>exprs(temp)<-log2(exp(exprs(temp)))
>
>design <- model.matrix(~0 
>+factor(c(3,2,4,5,1,4,3,6,6,1,2,1,3,5,4,2,5,6,3,4,5,6,1,2,3,4,5,6,1,2)))
>
>colnames(design) <- c("Untreated","L1","L2","L1A","L2A","A")
>contrast.matrix <- makeContrasts(L1-Untreated, L2-Untreated, 
>L1A-Untreated, L2A-Untreated, A-Untreated, L1A-A, L2A-A, L1A-L1, 
>L2A-L2, levels=design)
>
>fit <- lmFit(temp, design)
>fit2 <- contrasts.fit(fit, contrast.matrix)
>fit2 <- eBayes(fit2)
>topTable(fit2, adjust="BH")
>
> > design unordered:
>    Untreated L1 L2 L1A L2A A
>1          0  0  1   0   0 0
>2          0  1  0   0   0 0
>3          0  0  0   1   0 0
>4          0  0  0   0   1 0
>5          1  0  0   0   0 0
>6          0  0  0   1   0 0
>7          0  0  1   0   0 0
>8          0  0  0   0   0 1
>9          0  0  0   0   0 1
>10         1  0  0   0   0 0
>11         0  1  0   0   0 0
>12         1  0  0   0   0 0
>13         0  0  1   0   0 0
>14         0  0  0   0   1 0
>15         0  0  0   1   0 0
>16         0  1  0   0   0 0
>17         0  0  0   0   1 0
>18         0  0  0   0   0 1
>19         0  0  1   0   0 0
>20         0  0  0   1   0 0
>21         0  0  0   0   1 0
>22         0  0  0   0   0 1
>23         1  0  0   0   0 0
>24         0  1  0   0   0 0
>25         0  0  1   0   0 0
>26         0  0  0   1   0 0
>27         0  0  0   0   1 0
>28         0  0  0   0   0 1
>29         1  0  0   0   0 0
>30         0  1  0   0   0 0
>
> > design ordered:
>    Untreated L1 L2 L1A L2A A
>1          1  0  0   0   0 0
>2          1  0  0   0   0 0
>3          1  0  0   0   0 0
>4          1  0  0   0   0 0
>5          1  0  0   0   0 0
>6          0  1  0   0   0 0
>7          0  1  0   0   0 0
>8          0  1  0   0   0 0
>9          0  1  0   0   0 0
>10         0  1  0   0   0 0
>11         0  0  1   0   0 0
>12         0  0  1   0   0 0
>13         0  0  1   0   0 0
>14         0  0  1   0   0 0
>15         0  0  0   1   0 0
>16         0  0  0   1   0 0
>17         0  0  0   1   0 0
>18         0  0  0   1   0 0
>19         0  0  0   1   0 0
>20         0  0  0   0   1 0
>21         0  0  0   0   1 0
>22         0  0  0   0   1 0
>23         0  0  0   0   1 0
>24         0  0  0   0   1 0
>25         0  0  0   0   0 1
>26         0  0  0   0   0 1
>27         0  0  0   0   0 1
>28         0  0  0   0   0 1
>29         0  0  0   0   0 1
>30         0  0  0   0   0 1
>
> >
> > > sessionInfo()
> >R version 2.4.1 (2006-12-18)
> >i386-pc-mingw32
> >
> >locale:
> >LC_COLLATE=English_United
> >Kingdom.1252;LC_CTYPE=English_United
> >Kingdom.1252;LC_MONETARY=English_United
> >Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252
> >
> >attached base packages:
> >[1]
> >"tools" "stats" "graphics" "grDevices"
> >"utils" "datasets" "methods" "base"
> >
> >other attached packages:
> > vsn Biobase limma
> >"1.12.0" "1.12.2" "2.9.8"
>
>
>
>Copy addresses and emails from any email account to Yahoo! Mail - 
>quick, easy and free. 
><http://us.rd.yahoo.com/mail/uk/taglines/default/trueswitch/*http://uk.docs.yahoo.com/trueswitch2.html>Do 
>it now...