[BioC] help needed - 7 array time course gene expression no replicates - fold change calculation
James W. MacDonald
jmacdon at uw.edu
Tue Jul 10 16:08:28 CEST 2012
Hi Liat,
On 7/10/2012 7:09 AM, Liat [guest] wrote:
> Dear All,
>
> I am new to both bioconductor and microarrays and am struggling quite a bit.
>
> I am trying to analyze data collected at 7 time points for just 1 treatment (so basically at time 0 something was added to the cells and we want to know how expression changed along time). There are no replicates.
>
> Not having replicates seems to cause quite a lot of problems. I assume I'm just not using the right packages/calls.
It depends on what assumptions you are willing to make. If you want to
assume that there is a linear response between time and gene expression
(where time is considered to be a continuous covariate rather than a
factor level), then you can fit a model using these data. You could even
allow for curvature in the line by adding a quadratic or cubic term. In
that situation you could still use limma.
However, if you are simply looking to find differences between, say,
time 1 and time 0, then you have no replication and will have to rely
only on fold change. This is a simple matter of subtracting one column
of your data matrix from another (assuming that you have taken logs,
which you should do).
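As a minimal sketch of that subtraction, assuming a log2-scale matrix
with genes in rows and the seven time points in columns (the matrix
`expr` and its simulated values are made up for illustration, not taken
from any real data set):

```r
# Hypothetical log2 expression matrix: 5 genes (rows) x 7 time points (columns).
set.seed(1)
expr <- matrix(rnorm(5 * 7, mean = 8), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("t", 0:6)))

# Log2 fold change of each later time point relative to time 0.
# expr[, -1] is 5 x 6; the time-0 column is recycled down each column,
# so every gene has its own time-0 value subtracted.
logFC <- expr[, -1] - expr[, 1]

# A two-fold change on the raw scale corresponds to |log2 FC| >= 1.
twofold <- abs(logFC) >= 1
changed <- rownames(logFC)[rowSums(twofold) > 0]
```

Here `changed` holds the genes that pass the two-fold cutoff at any of
time points 1-6. Without replicates there is no way to say how often a
difference of that size would arise by chance.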
>
> I tried using the limma package, but my design matrix is actually a vector (as I only have one treatment) and that doesn't seem to work.
The design matrix will not be a single vector regardless. You are not
comparing treatment, as you have only one treatment. You are comparing
time, for which you have seven observations. Let's say you used 0 and
1-6 hours as time points. You could use a design matrix like
> time <- seq(0,6,1)
> model.matrix(~time)
  (Intercept) time
1           1    0
2           1    1
3           1    2
4           1    3
5           1    4
6           1    5
7           1    6
attr(,"assign")
[1] 0 1
Here the first column is the intercept and the second column is time as
a continuous covariate. You could also add a quadratic term
> model.matrix(~time+I(time^2))
  (Intercept) time I(time^2)
1           1    0         0
2           1    1         1
3           1    2         4
4           1    3         9
5           1    4        16
6           1    5        25
7           1    6        36
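where you are allowing for curvature in one direction. You might want to
add a cubic term as well, which will allow for two bends in the curve,
but you are really running out of degrees of freedom at that point (four
coefficients fit to seven observations leaves only three residual
degrees of freedom).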
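A rough sketch of fitting the linear-in-time model to every gene,
assuming a log2 matrix with genes in rows and arrays in columns (the
simulated `expr`, with one planted trend in `gene1`, is purely
illustrative). The base R fit gives per-gene slopes; the limma calls are
only run if the package happens to be installed:

```r
# Hypothetical data: 100 genes x 7 arrays taken at 0-6 hours.
time <- seq(0, 6, 1)
set.seed(2)
expr <- matrix(rnorm(100 * 7, mean = 8, sd = 0.2), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("t", time)))
expr["gene1", ] <- 6 + 0.5 * time + rnorm(7, sd = 0.1)  # planted upward trend

# Time as a continuous covariate: intercept plus slope.
design <- model.matrix(~ time)

# Base R: a matrix response fits one regression per column of t(expr),
# i.e. one regression per gene.
fit <- lm(t(expr) ~ time)
slopes <- coef(fit)["time", ]   # log2 change per hour, one value per gene

# With limma available, the same design yields moderated t-statistics.
if (requireNamespace("limma", quietly = TRUE)) {
  lfit <- limma::eBayes(limma::lmFit(expr, design))
  print(limma::topTable(lfit, coef = "time"))
}
```

Swapping in `model.matrix(~ time + I(time^2))` would fit the quadratic
version, at the cost of one more residual degree of freedom.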
There are other things you could do as well. You could look for big
changes between time points (and relatively unchanged expression at all
other times) by aggregating time points. As an example, a given gene
might be relatively unchanged at the first three time points, then jump
up to a higher expression level and remain there for the remaining four
time points. A t-test comparing the mean of the first three points and
the remaining four time points would tease that out. You could do
several such comparisons (time 0 vs all others, time 0 and 1 vs times
2-6, etc).
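One way such an aggregated comparison might look, assuming the same
genes-by-time-points log2 layout (the simulated `expr`, the planted step
in `gene1`, and the helper `step_test` are all hypothetical):

```r
# Hypothetical data: 50 genes x 7 time points, mostly noise.
set.seed(3)
time <- 0:6
expr <- matrix(rnorm(50 * 7, mean = 8, sd = 0.2), nrow = 50,
               dimnames = list(paste0("gene", 1:50), paste0("t", time)))
# Planted pattern: gene1 jumps up after the third time point and stays there.
expr["gene1", 4:7] <- expr["gene1", 4:7] + 2

early <- 1:3   # columns for times 0-2
late  <- 4:7   # columns for times 3-6

# Two-sample t-test per gene, assuming equal variance in both groups and
# treating the time points within each group as if they were replicates.
step_test <- function(x) {
  tt <- t.test(x[late], x[early], var.equal = TRUE)
  c(diff = unname(tt$estimate[1] - tt$estimate[2]), p = tt$p.value)
}
res <- t(apply(expr, 1, step_test))
head(res[order(res[, "p"]), ])
```

The p-values here are unadjusted, and changing `early` and `late` gives
the other splits (time 0 vs all others, and so on).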
Again, there are underlying assumptions for this sort of analysis, and
you are looking for a very particular pattern. It really comes down to
what sort of assumptions you are willing to make, and whether or not you
will be able to defend those assumptions to others.
Best,
Jim
>
> I would like to consider genes that show a minimum of two-fold change in expression. So (I think - again, I'm a complete newbie) I need to compare each of time points 1-6 to time point 0 and look at the difference in expression levels.
>
> How can I do that?
>
> Your help will be greatly appreciated!
> Liat.
>
> -- output of sessionInfo():
>
>
>
>
>
> --
> Sent via the guest posting facility at bioconductor.org.
>
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099