[BioC] DESeq2 multiple factors nested design

Michael Love michaelisaiahlove at gmail.com
Fri Mar 28 01:55:36 CET 2014


sorry, the last two comparisons should instead start with:

ctrst <- numeric(length(resultsNames(dds)))
names(ctrst) <- resultsNames(dds)

I will try to work up a simpler interface for these comparisons.

Mike

On Thu, Mar 27, 2014 at 8:10 PM, Michael Love
<michaelisaiahlove at gmail.com> wrote:
> hi Isobel,
>
> On Thu, Mar 27, 2014 at 1:41 PM, Isobel [guest] <guest at bioconductor.org> wrote:
>> My problem is in finding a way of accounting for clone within the model in DESeq. In an ideal world I would like to incorporate “clone” as a factor in the model, to account for differences between samples resulting from “clone”, so in the first instance I’d like to compare the following:
>>
>> ~ clone + species + diet + species:diet
>> ~ clone + species + diet
>>
>> However, I think because “clone” is nested within “species” so any given level of clone is only present within one biotype, simply putting clone into the design formula of DESeq2 does not work,
>
>
> Note that, because the clones are entirely within a single species,
> the species effect is simply a linear combination of the clone
> effects. I'll show this with an example using normal data and a linear
> model:
>
> R> species <- factor(rep(1:2,each=4))
> R> clone <- factor(rep(1:4,each=2))
> R> cbind(species, clone)
>      species clone
> [1,]       1     1
> [2,]       1     1
> [3,]       1     2
> [4,]       1     2
> [5,]       2     3
> [6,]       2     3
> [7,]       2     4
> [8,]       2     4
>
> R> set.seed(1)
> R> y <- rnorm(8)
> R> coef(lm(y ~ species))
> (Intercept)    species2
>  0.07921043  0.10448786
>
> The species effect is 0.10448786, which we get by only including species.
>
> If we run the model with only clone:
>
> R> res <- coef(lm(y ~ clone))
> R> names(res)
> [1] "(Intercept)" "clone2"      "clone3"      "clone4"
>
> The species effect is the average of clone 3 and clone 4 effect minus
> the average of clone 2 and clone 1 effect (clone 1 is absorbed by the
> intercept in this case, so 0):
>
> R> unname(.5*res["clone3"] + .5*res["clone4"] - (.5*res["clone2"]))
> [1] 0.1044879
>
>
>
>> I am interested in genes differentially expressed between diet A and diet B
>> in genes differentially expressed between species,
>> and in particular I am interested in identifying genes where differential expression between diets varies in relation to species (i.e. the diet-species interaction).
>
> So then, getting back to how to accomplish these comparisons in
> DESeq2. I'd like to give advice using DESeq2 version >= 1.3, as this
> is simpler in the new version, and it will be released very soon (on
> April 14). If you can try out this code with the development branch,
> that would be great, if not, just email me and I can formulate this
> for version 1.2.
>
> For DESeq2 version >= 1.3, I would specify:
>
> design(dds) <- ~ clone + diet + clone:diet
>
> Then the following comparisons:
>
> comparison of diet B over diet A:
>
> results(dds, contrast=c("diet","B","A"))
>
> comparison of species 2 over species 1, this is the average of the
> clone effects for each species, so:
>
> ctrst <- numeric(resultsNames(dds))
> sp2clones <- paste0("clone",dds$clone[dds$species == "2"])
> sp1clones <- paste0("clone",dds$clone[dds$species == "1"])
> ctrst[sp2clones] <- 1/5
> ctrst[sp1clones] <- -1/5
> results(dds, contast=ctrst)
>
> for the interaction between species 1 and diet, we pull out the
> interaction effects and average for each clone:
>
> ctrst <- numeric(resultsNames(dds))
> sp1dietA <- paste0("clone",dds$clone[dds$species == "1"],".dietA")
> sp1dietB <- paste0("clone",dds$clone[dds$species == "1"],".dietB")
> ctrst[sp1dietB] <- 1/5
> ctrst[sp1dietA] <- -1/5
> results(dds, contast=ctrst)
>
>
> Mike



More information about the Bioconductor mailing list