[BioC] design matrix
Gordon Smyth
smyth at wehi.EDU.AU
Wed May 23 13:13:26 CEST 2007
At 08:15 PM 23/05/2007, Lev Soinov wrote:
>Dear Gordon,
>
>When using a script that I already asked about (below), I noticed
>the following.
>
>My Data matrix is for 30 samples, the first column represents probe
>IDs. I use for the unordered dataset (samples are not ordered in the
>Data file according to the treatments) the following script:
> > design <- model.matrix(~0
> +factor(c(3,2,4,5,1,4,3,6,6,1,2,1,3,5,4,2,5,6,3,4,5,6,1,2,3,4,5,6,1,2)))
> > colnames(design) <- c("Untreated","L1","L2","L1A","L2A","A")
> > contrast.matrix <- makeContrasts(L1-Untreated, L2-Untreated,
> L1A-Untreated, L2A-Untreated, A-Untreated, L1A-A, L2A-A, L1A-L1,
> L2A-L2, levels=design)
>
>For the ordered data it should produce the same results in the
>topTable as the script:
> > design <- model.matrix(~0
> +factor(c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5,5,5,6,6,6,6,6,6)))
> > colnames(design) <- c("Untreated","L1","L2","L1A","L2A","A")
> > contrast.matrix <- makeContrasts(L1-Untreated, L2-Untreated,
> L1A-Untreated, L2A-Untreated, A-Untreated, L1A-A, L2A-A, L1A-L1,
> L2A-L2, levels=design)
>
>However there are some differences, especially in the AveExpr column:
>
>1. unordered:
> ID logFC AveExpr t P.Value adj.P.Val B
>168840 -5.302214 12.51356 -27.83821 3.590400e-09 2.717735e-05 11.23314
>101140 -5.175045 12.51114 -27.67796 3.756415e-09 2.717735e-05 11.20135
>123375 -4.265825 14.04540 -27.65702 3.778740e-09 2.717735e-05 11.19717
>194271 -4.237919 16.31592 -27.05930 4.483564e-09 2.717735e-05 11.07557
>142723 -5.529549 12.89764 -26.63683 5.071066e-09 2.717735e-05 10.98684
>2. ordered:
> ID logFC AveExpr t P.Value adj.P.Val B
>168840 -5.302013 10.23113 -27.83760 3.589533e-09 2.716231e-05 11.23349
>101140 -5.174833 10.22872 -27.68107 3.751573e-09 2.716231e-05 11.20243
>123375 -4.265788 11.76289 -27.65844 3.775669e-09 2.716231e-05 11.19792
>194271 -4.237924 14.03339 -27.06015 4.480631e-09 2.716231e-05 11.07621
>142723 -5.529383 10.61520 -26.63693 5.068864e-09 2.716231e-05 10.98732
>
>Could you please comment on this? Is it essential to order samples
>in the data file according to treatments before running lmFit? I
>thought that this should not matter as long as one gets the actual
>order in the design matrix right.
Sample order makes no different to limma. If your results are
different, then you've made a mistake.
Best wishes
Gordon
>With kind regards,
>Lev.
>
>My script:
>s<-scan("Data.txt",what='character')
>sm<-matrix(s,byrow=TRUE,ncol=31)
>rownames(sm)<-sm[,1]
>sm<-sm[,2:ncol(sm)]
>snn<-apply(sm,2,as.numeric)
>rownames(snn)<-rownames(sm)
>signals<-snn
><?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" />
>temp<-vsn(signals)
>exprs(temp)<-log2(exp(exprs(temp)))
>
>design <- model.matrix(~0
>+factor(c(3,2,4,5,1,4,3,6,6,1,2,1,3,5,4,2,5,6,3,4,5,6,1,2,3,4,5,6,1,2)))
>
>colnames(design) <- c("Untreated","L1","L2","L1A","L2A","A")
>contrast.matrix <- makeContrasts(L1-Untreated, L2-Untreated,
>L1A-Untreated, L2A-Untreated, A-Untreated, L1A-A, L2A-A, L1A-L1,
>L2A-L2, levels=design)
>
>fit <- lmFit(temp, design)
>fit2 <- contrasts.fit(fit, contrast.matrix)
>fit2 <- eBayes(fit2)
>topTable(fit2, adjust="BH")
>
> > design unordered:
> Untreated L1 L2 L1A L2A A
>1 0 0 1 0 0 0
>2 0 1 0 0 0 0
>3 0 0 0 1 0 0
>4 0 0 0 0 1 0
>5 1 0 0 0 0 0
>6 0 0 0 1 0 0
>7 0 0 1 0 0 0
>8 0 0 0 0 0 1
>9 0 0 0 0 0 1
>10 1 0 0 0 0 0
>11 0 1 0 0 0 0
>12 1 0 0 0 0 0
>13 0 0 1 0 0 0
>14 0 0 0 0 1 0
>15 0 0 0 1 0 0
>16 0 1 0 0 0 0
>17 0 0 0 0 1 0
>18 0 0 0 0 0 1
>19 0 0 1 0 0 0
>20 0 0 0 1 0 0
>21 0 0 0 0 1 0
>22 0 0 0 0 0 1
>23 1 0 0 0 0 0
>24 0 1 0 0 0 0
>25 0 0 1 0 0 0
>26 0 0 0 1 0 0
>27 0 0 0 0 1 0
>28 0 0 0 0 0 1
>29 1 0 0 0 0 0
>30 0 1 0 0 0 0
>
> > design ordered:
> Untreated L1 L2 L1A L2A A
>1 1 0 0 0 0 0
>2 1 0 0 0 0 0
>3 1 0 0 0 0 0
>4 1 0 0 0 0 0
>5 1 0 0 0 0 0
>6 0 1 0 0 0 0
>7 0 1 0 0 0 0
>8 0 1 0 0 0 0
>9 0 1 0 0 0 0
>10 0 1 0 0 0 0
>11 0 0 1 0 0 0
>12 0 0 1 0 0 0
>13 0 0 1 0 0 0
>14 0 0 1 0 0 0
>15 0 0 0 1 0 0
>16 0 0 0 1 0 0
>17 0 0 0 1 0 0
>18 0 0 0 1 0 0
>19 0 0 0 1 0 0
>20 0 0 0 0 1 0
>21 0 0 0 0 1 0
>22 0 0 0 0 1 0
>23 0 0 0 0 1 0
>24 0 0 0 0 1 0
>25 0 0 0 0 0 1
>26 0 0 0 0 0 1
>27 0 0 0 0 0 1
>28 0 0 0 0 0 1
>29 0 0 0 0 0 1
>30 0 0 0 0 0 1
>
> >
> > > sessionInfo()
> >R version 2.4.1 (2006-12-18)
> >i386-pc-mingw32
> >
> >locale:
> >LC_COLLATE=English_United
> >Kingdom.1252;LC_CTYPE=English_United
> >Kingdom.1252;LC_MONETARY=English_United
> >Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252
> >
> >attached base packages:
> >[1]
> >"tools" "stats" "graphics" "grDevices"
> >"utils" "datasets" "methods" "base"
> >
> >other attached packages:
> > vsn Biobase limma
> >"1.12.0" "1.12.2" "2.9.8"
>
>
>
>Copy addresses and emails from any email account to Yahoo! Mail -
>quick, easy and free.
><http://us.rd.yahoo.com/mail/uk/taglines/default/trueswitch/*http://uk.docs.yahoo.com/trueswitch2.html>Do
>it now...
More information about the Bioconductor
mailing list