[BioC] Normalized data in expresso and Expression Console differ
Oliver Stolpe
oliver.stolpe at fu-berlin.de
Tue Jun 23 11:25:15 CEST 2009
Hello list,
I currently use the expresso function from the Bioconductor affy package to
analyze Affymetrix data:
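# data: an AffyBatch object (e.g. read with ReadAffy())
normalized <- expresso(data, bgcorrect.method = "mas",
                       normalize.method = "quantiles",
                       pmcorrect.method = "mas",
                       summary.method = "mas")
# the "mas" summary is on the raw intensity scale, so take log2 afterwards
matrix <- log2(exprs(normalized))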
As a reference I use the Expression Console by Affymetrix. My goal is
to reproduce in R the normalized data (and therefore the resulting boxplot)
from the Expression Console. I took the log2 after normalization and
correction because the Expression Console delivered relatively small
values (they appear to be log-transformed), whereas the expresso values had
a very large range. Unfortunately, the results differ.
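The log2 step described above can be checked directly against the head() output listed under "Some helpful data". The following is a minimal sketch in base R, not part of the original analysis: the vector names are made up for illustration, and the numbers are the data1.cel.gz column copied by hand from that output.

```r
# Values of the first column (data1.cel.gz) as printed by head(matrix_expresso).
expresso_raw <- c(67.16587, 72.03782, 117.65746,
                  185.33413, 164.88572, 1238.80516)

# The same column as printed by head(matrix_expresso_log2).
expresso_log2_posted <- c(6.069657, 6.170683, 6.878449,
                          7.533985, 7.365323, 10.274734)

# log2 compresses the range: compare the spread before and after.
range_raw  <- range(expresso_raw)
range_log2 <- range(log2(expresso_raw))

# Largest absolute difference between the recomputed and the posted log2
# values; it should be no larger than the rounding of the printed numbers.
max_abs_diff <- max(abs(log2(expresso_raw) - expresso_log2_posted))

print(range_raw)
print(range_log2)
print(max_abs_diff)
```

If the Expression Console export is indeed on a log2 scale, this is the scale on which the two boxplots would have to be compared.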
Does anyone know why they differ so noticeably (different mean,
many outliers)? Please have a look at the attached boxplots.
Even when I leave out the normalization step in expresso, the result looks
nearly the same.
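To illustrate what quantile normalization does to the per-array distributions that a boxplot displays, here is a small base R sketch. It is a toy under stated assumptions: the function quantile_normalize and the simulated matrix are hypothetical, the function works on an already summarized matrix, and it is not the routine that expresso calls, which normalizes at the probe level before summarization.

```r
# Toy quantile normalization on a numeric matrix
# (rows = probe sets, columns = arrays). Illustrative only.
quantile_normalize <- function(x) {
  # Rank of each value within its column; ties broken by order of appearance.
  ranks <- apply(x, 2, rank, ties.method = "first")
  # Sort each column, then average across columns at each rank position.
  sorted <- apply(x, 2, sort)
  target <- rowMeans(sorted)
  # Replace every value by the target value at its rank.
  normalized <- apply(ranks, 2, function(r) target[r])
  dimnames(normalized) <- dimnames(x)
  normalized
}

# Simulated intensities for three arrays with different location and spread
# (made-up data, not the zebrafish arrays from this post).
set.seed(1)
toy <- cbind(a = rlnorm(200, meanlog = 5.0, sdlog = 1.0),
             b = rlnorm(200, meanlog = 5.5, sdlog = 1.2),
             c = rlnorm(200, meanlog = 4.8, sdlog = 0.9))
toy_qn <- quantile_normalize(toy)

# Five-number summaries (the numbers a boxplot draws) on the log2 scale.
summary_before <- apply(log2(toy), 2, fivenum)
summary_after  <- apply(log2(toy_qn), 2, fivenum)
print(round(summary_before, 2))
print(round(summary_after, 2))
```

In this toy the five-number summaries become identical across columns after normalization. With expresso the summarized values need not match exactly, because the normalization is applied before the summarization step.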
I would be grateful for any suggestions.
Thanks in advance,
best regards,
Oliver
Some helpful data:
> head(matrix_expresso)
data1.cel.gz data2.cel.gz data3.cel.gz data4.cel.gz
    67.16587     72.66765     73.49201     74.00240
    72.03782     95.80303     97.60087     64.60356
   117.65746    142.88926    138.01063    159.64211
   185.33413    292.81031    232.82629    259.88629
   164.88572    260.95710    243.47892    247.80303
  1238.80516   1674.33256   1525.44652   1490.71100
data5.cel.gz data6.cel.gz
     73.5097     67.97570
     93.9136     84.26307
    145.7278    124.94947
    250.9573    235.76545
    235.0867    251.55364
   1486.8813   1523.14721
> head(matrix_expresso_log2)
data1.cel.gz data2.cel.gz data3.cel.gz data4.cel.gz
    6.069657     6.183241     6.199515     6.209500
    6.170683     6.581999     6.608822     6.013542
    6.878449     7.158754     7.108636     7.318697
    7.533985     8.193823     7.863110     8.021737
    7.365323     8.027669     7.927653     7.953050
   10.274734    10.709370    10.575016    10.541785
data5.cel.gz data6.cel.gz
    6.199863     6.086947
    6.553262     6.396829
    7.187132     6.965201
    7.971298     7.881209
    7.877049     7.974722
   10.538074    10.572840
> sessionInfo()
R version 2.9.0 (2009-04-17)
i686-redhat-linux-gnu
locale:
LC_CTYPE=de_DE@euro;LC_NUMERIC=C;LC_TIME=de_DE@euro;LC_COLLATE=de_DE@euro;LC_MONETARY=C;LC_MESSAGES=de_DE@euro;LC_PAPER=de_DE@euro;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=de_DE@euro;LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] zebrafishcdf_2.4.0 marray_1.22.0 limma_2.18.0 RdbiPgSQL_1.18.1
[5] Rdbi_1.18.0 multtest_2.0.0 class_7.2-47 MASS_7.2-47
[9] affy_1.22.0 Biobase_2.4.1
loaded via a namespace (and not attached):
[1] affyio_1.12.0 preprocessCore_1.6.0 splines_2.9.0
[4] survival_2.35-4 tools_2.9.0
-------------- next part --------------
A non-text attachment was scrubbed...
Name: boxplot_expresso_log2.png
Type: image/png
Size: 5615 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20090623/55d897e4/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: boxplot_expression_console_anonym.png
Type: image/png
Size: 15068 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20090623/55d897e4/attachment-0001.png>