[BioC] Artifact in RMA-normalized data

Mary Putt mputt at cceb.upenn.edu
Thu Mar 4 19:15:17 MET 2004


Hi,
I normalized the data from 13 arrays (6 in group H and 7 in group P) using
rma. At the low end of the expression scale, the arrays from my H group are
systematically lower than those from the P group, while at the high end of
the scale the H arrays are higher than the P arrays. The differences are
subtle, but they show up in the MVA plots as well as in the summary
statistics below. I also got warning messages from affy during the
normalization. It doesn't seem to me that RMA should introduce this type of
artifact, unless there's something about the warning messages that I don't
understand. Does anyone have insights on this?

Thanks,
Mary
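The group-level pattern described above can be checked with a single M-versus-A plot of per-probeset group medians. The following is only a minimal sketch: the matrix `x` is simulated and stands in for `exprs.allnorm`, and the column names `h1`..`h6`, `p1`..`p7` are assumed to follow the labels used in the summary table.

```r
# Simulated stand-in for exprs.allnorm: 1000 probesets x 13 arrays (log2 scale).
# Replace x with the real expression matrix to inspect the actual data.
set.seed(1)
x <- matrix(rnorm(1000 * 13, mean = 6, sd = 2), nrow = 1000,
            dimnames = list(NULL, c(paste0("h", 1:6), paste0("p", 1:7))))

# Group membership inferred from the column names (an assumption of this sketch)
is_h <- grepl("^h", colnames(x))

# Per-probeset median within each group
med_h <- apply(x[, is_h], 1, median)
med_p <- apply(x[, !is_h], 1, median)

# M = log-scale difference (H minus P), A = average log intensity
M <- med_h - med_p
A <- (med_h + med_p) / 2

plot(A, M, pch = ".", xlab = "A = (H + P) / 2", ylab = "M = H - P")
abline(h = 0, col = "grey")
# A smoothed trend that rises with A would match the pattern described:
# H below P at low intensity, H above P at high intensity.
lines(lowess(A, M), col = "red")
```

With the simulated data the trend should stay near zero; any tilt seen with the real matrix is what the plot is meant to expose.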

########
#Program to normalize the data
############

library(affy)
load('all.Rdata')
allnorm<-expresso(alldata, bgcorrect.method='rma',
    normalize.method='quantiles.robus', pmcorrect.method='pmonly',
    summary.method='medianpolish')
exprs.allnorm<-exprs(allnorm)
save(exprs.allnorm, file='exprs.allnorm.Rdata')
#########
#Warning msg following normalization
############

#> source('expresso.all.r')
#background correction: rma
#normalization: quantiles.robus
#PM/MM correction : pmonly
#expression values: medianpolish
#background correcting...done.
#normalizing...Chip weights are  1 1 1 1 1 1 0 1 1 1 1 1 1 1
#Chip weights are  1 1 1 1 1 1 1 1 0 1 1 1 1 1
#done.
#22283 ids to be processed
#.........
#Warning messages:
#1: the condition has length > 1 and only the first element will be used
#   in: if (remove.extreme == "variance") {
#2: the condition has length > 1 and only the first element will be used
#   in: if (remove.extreme == "mean") {
#3: the condition has length > 1 and only the first element will be used
#   in: if (remove.extreme == "both") {
#4: the condition has length > 1 and only the first element will be used
#   in: if (remove.extreme == "variance") {
#5: the condition has length > 1 and only the first element will be used
#   in: if (remove.extreme == "mean") {
#6: the condition has length > 1 and only the first element will be used
#   in: if (remove.extreme == "both") {
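As general background, R emits this warning when an `if` condition is a vector rather than a single TRUE/FALSE (R 4.2.0 and later treat it as an error). One way that can happen is when an argument such as `remove.extreme` is left at a multi-valued default. Whether that is the mechanism here is an assumption, not something confirmed from the affy source. The sketch below uses a hypothetical function, `pick_rule`, to show the usual idiom for such arguments.

```r
# Hypothetical function, NOT the affy source: the default is a vector of choices.
pick_rule <- function(remove.extreme = c("variance", "mean", "both", "none")) {
  # match.arg() collapses an untouched multi-valued default to its first
  # element, so each comparison below sees a single string.
  remove.extreme <- match.arg(remove.extreme)
  if (remove.extreme == "variance") {
    "drop by variance"
  } else if (remove.extreme == "mean") {
    "drop by mean"
  } else if (remove.extreme == "both") {
    "drop by both"
  } else {
    "drop nothing"
  }
}

pick_rule()         # default: the first choice, "variance", is used
pick_rule("none")   # an explicit single value selects that branch
```

Without the `match.arg()` line, the comparison would be made against all four strings at once, which is the situation the warning describes.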

#####################
#descriptive statistics of normalized data
#
#note that h and p are different groups
##################
> summary
            h1     h2     h3     h4     h5     h6     p1     p2     p3     p4
Min      2.997  3.008  3.051  3.010  2.967  3.005  3.123  3.057  3.119  3.102
1stQrtl  4.719  4.679  4.762  4.739  4.771  4.771  4.895  4.717  4.926  4.891
Median   5.924  5.901  5.950  5.970  5.970  5.961  6.015  5.942  6.015  6.018
Mean     6.165  6.143  6.150  6.171  6.182  6.178  6.167  6.163  6.172  6.162
3rdQrtl  7.291  7.300  7.266  7.358  7.316  7.288  7.201  7.281  7.216  7.224
Max     13.310 13.620 13.760 13.800 13.660 13.660 13.800 13.790 13.670 13.660
            p5     p6     p7
Min      3.121  3.017  3.041
1stQrtl  4.938  4.829  4.835
Median   6.031  5.993  6.015
Mean     6.172  6.166  6.168

> apply(summary[,1:6], 1, median)
    Min 1stQrtl  Median    Mean 3rdQrtl     Max
 3.0065  4.7505  5.9555  6.1680  7.2955 13.6600
> apply(summary[,7:13], 1, median)
    Min 1stQrtl  Median    Mean 3rdQrtl     Max
  3.102   4.891   6.015   6.167   7.220  13.660
> apply(summary[,1:6], 1, mean)
      Min   1stQrtl    Median      Mean   3rdQrtl       Max
 3.006333  4.740167  5.946000  6.164833  7.303167 13.635000
> apply(summary[,7:13], 1, mean)
      Min   1stQrtl    Median      Mean   3rdQrtl       Max
 3.082857  4.861571  6.004143  6.167143  7.227143 13.602857
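The direction of the group difference at each summary statistic can be read off by subtracting the two rows of group means. This small sketch just re-enters the values printed above; the vector names `h_mean`, `p_mean` and `p_minus_h` are introduced here for illustration.

```r
stat <- c("Min", "1stQrtl", "Median", "Mean", "3rdQrtl", "Max")

# Group means of the per-array summary statistics, copied from the output above
h_mean <- setNames(c(3.006333, 4.740167, 5.946000, 6.164833, 7.303167, 13.635000), stat)
p_mean <- setNames(c(3.082857, 4.861571, 6.004143, 6.167143, 7.227143, 13.602857), stat)

# Positive values: P above H; negative values: H above P
p_minus_h <- p_mean - h_mean
print(round(p_minus_h, 3))
```

The differences are positive for Min, 1stQrtl and Median and negative for 3rdQrtl and Max, while the group means of the Mean row are nearly equal, which is the crossing pattern described at the top of the message.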




--
Mary E. Putt
Assistant Professor of Biostatistics
Department of Biostatistics and Epidemiology
Center for Biostatistics and Epidemiology
School of Medicine, University of Pennsylvania,
621 Blockley Hall
423 Guardian Drive
Philadelphia, PA
19104-6021

Ph. (215) 573-7020
Fax (215) 573-4865


