[BioC] Normalized data in expresso and Expression Console differ
cstrato
cstrato at aon.at
Tue Jun 23 18:35:46 CEST 2009
Dear Oliver
Please note that Expression Console scales the mean expression levels to
a pre-defined target intensity. You therefore need to scale your data
accordingly, or use the function mas5(..., sc=500) from package affy.
Furthermore, the MAS5 algorithm from Affymetrix does not use quantile
normalization.
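
As a rough illustration of the scaling step only (not the full MAS5
algorithm), the following base-R sketch rescales each array so that its
trimmed mean equals a target intensity and then takes log2. The helper
name scale_to_target, the 2% trim and the toy values are assumptions made
for illustration; with real CEL files the call mentioned above,
mas5(..., sc=500), should take care of the scaling.

```r
# Toy expression matrix on the raw (non-log) scale: 6 probesets x 2 arrays.
# The values are made up for illustration.
x <- matrix(c(67, 72, 118, 185, 165, 1239,
              73, 96, 143, 293, 261, 1674),
            ncol = 2,
            dimnames = list(paste0("ps", 1:6), c("array1", "array2")))

# Hypothetical helper: scale every column so that its trimmed mean
# equals `target` (assumed trim fraction: 2% from each end).
scale_to_target <- function(m, target = 500, trim = 0.02) {
  sf <- target / apply(m, 2, mean, trim = trim)  # one scale factor per array
  sweep(m, 2, sf, "*")                           # multiply each column by its factor
}

scaled <- scale_to_target(x)
log2_scaled <- log2(scaled)

# After scaling, every array has the same trimmed mean (the target).
print(apply(scaled, 2, mean, trim = 0.02))
```

Values scaled this way and then log2-transformed are on a scale that can
be compared with a boxplot of scaled data; unscaled values are shifted by
a constant per array on the log2 scale.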
Regarding the apparent outliers: to my knowledge there exist four
different implementations of the MAS5 algorithm, namely GCOS, APT, affy
and xps, which all result in slightly different expression levels. See,
for example, Figure 4 of the vignette APTvsXPS.pdf from package xps:
http://www.bioconductor.org/packages/release/bioc/vignettes/xps/inst/doc/APTvsXPS.pdf
I must admit that I do not know why the different implementations
differ slightly.
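
One way to see how much two implementations disagree is to compare their
log2 values probeset by probeset. This is only a minimal sketch with
made-up numbers: impl_a and impl_b stand in for log2 expression levels of
the same probesets from two programs, and the cutoff of 0.5 is an
arbitrary assumption.

```r
# Made-up log2 expression levels for the same five probesets,
# as they might come from two different MAS5 implementations.
impl_a <- c(ps1 = 6.07, ps2 = 6.17, ps3 = 6.88, ps4 = 7.53, ps5 = 10.27)
impl_b <- c(ps1 = 6.10, ps2 = 6.15, ps3 = 6.90, ps4 = 8.20, ps5 = 10.30)

# Per-probeset difference on the log2 scale.
d <- impl_b - impl_a
print(summary(d))

# Overall agreement between the two sets of values.
print(cor(impl_a, impl_b))

# Flag probesets whose absolute difference exceeds an assumed cutoff.
cutoff <- 0.5
outliers <- names(d)[abs(d) > cutoff]
print(outliers)
```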
Best regards
Christian
_._._._._._._._._._._._._._._._._._
C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a
V.i.e.n.n.a A.u.s.t.r.i.a
e.m.a.i.l: cstrato at aon.at
_._._._._._._._._._._._._._._._._._
Oliver Stolpe wrote:
> Hello list,
>
> currently I use the expresso method from the Bioconductor package to
> analyze Affymetrix data:
>
> normalized <- expresso(data, bgcorrect.method = "mas",
> normalize.method = "quantiles",
> pmcorrect.method = "mas",
> summary.method = "mas")
> matrix <- log2(exprs(normalized))
>
> As a reference I use the Expression Console by Affymetrix. My goal is
> to rebuild the normalized data (and therefore the resulting boxplot)
> from the Expression Console with R. I took the log2 after normalization
> and correction since the Expression Console delivered relatively small
> values (they seemed to be log-transformed) and the expresso data had a
> really big range. Unfortunately the results differ.
>
>
> Does anyone know why they differ so noticeably (different mean,
> many outliers)? You may have a look at the boxplots I attached.
>
>
> Even when I leave out the normalization in expresso it looks nearly
> the same.
>
> I'm glad about any suggestions.
>
> Thanks in advance,
> best regards,
> Oliver
>
> Some helpful data:
>
> > head(matrix_expresso)
> data1.cel.gz data2.cel.gz data3.cel.gz data4.cel.gz
> 67.16587 72.66765 73.49201 74.00240
> 72.03782 95.80303 97.60087 64.60356
> 117.65746 142.88926 138.01063 159.64211
> 185.33413 292.81031 232.82629 259.88629
> 164.88572 260.95710 243.47892 247.80303
> 1238.80516 1674.33256 1525.44652 1490.71100
> data5.cel.gz data6.cel.gz
> 73.5097 67.97570
> 93.9136 84.26307
> 145.7278 124.94947
> 250.9573 235.76545
> 235.0867 251.55364
> 1486.8813 1523.14721
>
> > head(matrix_expresso_log2)
> data1.cel.gz data2.cel.gz data3.cel.gz data4.cel.gz
> 6.069657 6.183241 6.199515 6.209500
> 6.170683 6.581999 6.608822 6.013542
> 6.878449 7.158754 7.108636 7.318697
> 7.533985 8.193823 7.863110 8.021737
> 7.365323 8.027669 7.927653 7.953050
> 10.274734 10.709370 10.575016 10.541785
> data5.cel.gz data6.cel.gz
> 6.199863 6.086947
> 6.553262 6.396829
> 7.187132 6.965201
> 7.971298 7.881209
> 7.877049 7.974722
> 10.538074 10.572840
>
> > sessionInfo()
> R version 2.9.0 (2009-04-17)
> i686-redhat-linux-gnu
>
> locale:
> LC_CTYPE=de_DE@euro;LC_NUMERIC=C;LC_TIME=de_DE@euro;LC_COLLATE=de_DE@euro;LC_MONETARY=C;LC_MESSAGES=de_DE@euro;LC_PAPER=de_DE@euro;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=de_DE@euro;LC_IDENTIFICATION=C
>
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] zebrafishcdf_2.4.0 marray_1.22.0 limma_2.18.0 RdbiPgSQL_1.18.1
> [5] Rdbi_1.18.0 multtest_2.0.0 class_7.2-47 MASS_7.2-47
> [9] affy_1.22.0 Biobase_2.4.1
>
> loaded via a namespace (and not attached):
> [1] affyio_1.12.0 preprocessCore_1.6.0 splines_2.9.0
> [4] survival_2.35-4 tools_2.9.0
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor