[BioC] continued dye effects, after normalization

Jenny Drnevich drnevich at uiuc.edu
Wed Jan 10 21:02:43 CET 2007


Hi Kevin,

Thanks for your suggestion. They are spotted cDNA arrays. There are ~1300 
non-cDNA spots on the arrays: about 1000 blanks and 300 controls. Removing 
these non-cDNA spots before normalization (either within or between arrays) 
didn't seem to have much effect, but I did realize that my earlier PCA plots 
had the non-cDNA spots in them. When I remove the non-cDNA spots and redo 
the PCA plot, the dye effect is no longer seen strongly in PC2 alone, but 
it is strongly seen when plotting PC2 vs. PC3. PC2 and PC3 are about 50% 
and 30% of PC1, respectively, much less than when the non-cDNA spots were 
included, so this is helpful. However, I'm still back to wondering whether I 
should attempt another normalization or just account for the dye effect in 
the model as a blocking variable.
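
A minimal sketch of the two checks described above, using simulated data 
rather than the arrays from this thread. Everything in it is an assumption 
made for illustration: the spot counts, the "status" annotation, the size of 
the dye bias, and the use of base R's prcomp() and lm() in place of plotPCA 
and lmFit. It treats ref dye as a fixed-effect covariate, which is not the 
same as the duplicateCorrelation blocking approach in the original post.

```r
# Simulated example (hypothetical data, not the arrays from this thread).
set.seed(1)
n_spots  <- 500
n_arrays <- 24

# Hypothetical spot annotation: the last 50 rows stand in for blanks/controls.
status <- rep(c("cDNA", "control"), times = c(450, 50))

# Ref-dye grouping in the same pattern as the block vector used in the thread.
ref_dye <- factor(rep(c(1, 2, 1, 2, 1, 2, 1, 2), each = 3),
                  labels = c("refG", "refR"))

# Random M-values plus a spot-specific dye bias shared by all "refR" arrays.
M <- matrix(rnorm(n_spots * n_arrays, sd = 0.3), n_spots, n_arrays)
dye_bias <- rnorm(n_spots, sd = 0.5)
M[, ref_dye == "refR"] <- M[, ref_dye == "refR"] + dye_bias

# PCA on the cDNA spots only; arrays are the observations, so transpose.
pca <- prcomp(t(M[status == "cDNA", ]))

# Variance of PC2 and PC3 relative to PC1.
rel_var <- pca$sdev[2:3]^2 / pca$sdev[1]^2

# Mean PC1 score per ref-dye group; opposite signs indicate a dye split.
pc1_by_dye <- tapply(pca$x[, 1], ref_dye, mean)

# Per-spot linear model with ref dye as a covariate
# (lm fits one model per column of the response matrix).
fit <- lm(t(M[status == "cDNA", ]) ~ ref_dye)
dye_coef <- coef(fit)["ref_dyerefR", ]
```

In this simulation the dye term is estimated per spot, so dye_coef can be 
compared against the simulated dye_bias to see how much of the effect the 
model term absorbs.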

Thanks,
Jenny

At 11:07 AM 1/10/2007, Kevin R. Coombes wrote:
>Hi,
>
>What kind of arrays are these?
>
>We had a similar problem with Agilent arrays, which contain a few hundred 
>positive controls that are the brightest spots in the green channel but 
>are invisibly dark in the red channel.  Any normalization in use today 
>makes it look like there is a strong dye effect if you leave the controls 
>in.  However, if you first remove the controls and then normalize, the dye 
>effect disappears.
>
>Best,
>         Kevin
>
>Jenny Drnevich wrote:
>>Hi all,
>>I've been analyzing a spotted array experiment that used a common 
>>reference with a 2X2 factorial design. There were no technical dye swaps, 
>>but half of the 6 replicates in each group had the ref in Cy3 and half 
>>had the ref in Cy5. Now that Jim has modified plotPCA to accept matrices, 
>>I was checking for any unsuspected groupings that might indicate block 
>>effects. To my surprise, the arrays were still grouping based on the 
>>reference channel, even after inverting the M-values so that the 
>>reference channel was always in the denominator! Attached is a figure 
>>with 2 PCA plots, and hopefully it is small enough to make it through; 
>>the code that created them is below.  Has anyone else noticed this, and 
>>what have you done about it? I went back and checked some other 
>>experiments that used a common reference, and they also mostly showed a 
>>continued dye grouping. A between-array scale normalization, either on 
>>the regular M-values or on inverted M-values, failed to remove the dye 
>>effect as well. I didn't try other normalizations, but instead included 
>>'ref dye' as a blocking variable. The consensus correlation from 
>>duplicateCorrelation was 0.154, which, when included in the lmFit model, 
>>increased the number of genes found to be significantly different.
>>I have been working with a physics professor and his student who have 
>>developed a different data mining algorithm, which shows these dye 
>>effects even more strongly than PCA. They are suggesting another 
>>normalization is needed to remove the ref dye effect, and they want to 
>>normalize the ref dye groups separately. Doing a separate normalization 
>>doesn't seem like a good idea to me, and I wanted to get other opinions 
>>on the dye effect, my approach, and other normalization options.
>>Thanks!
>>Jenny
>>code:
>>RG <- read.maimages(targetsb$FileName,path="D:/MA Jenny",
>>                 source="genepix.median",names=targetsb$Label,wt.fun=f)
>>RG.half <- backgroundCorrect(RG,method="half")
>>MA.half <- normalizeWithinArrays(RG.half)
>>temp <- MA.half
>>temp$M[,targetsb$Cy3=="ref"] <- -1 * temp$M[,targetsb$Cy3=="ref"]
>>layout(matrix(1:2,2,1))
>>plotPCA(MA.half$M,groups=rep(c(1,2,1,2,1,2,1,2),each=3),
>>         groupnames=c("ref G","ref R"))
>>         # PC1 divides the arrays by which channel the ref was in
>>plotPCA(temp$M,groups=rep(c(1,2,1,2,1,2,1,2),each=3),
>>         groupnames=c("ref G","ref R"))
>>         # after inverting the M-values for half the arrays, PC1 divides
>>         # the arrays by one of the treatments, but
>>         # the dye effect still shows up in PC2
>>
>>MA.half.scale <- normalizeBetweenArrays(MA.half,method="scale")
>>design <- modelMatrix(targetsb,ref="ref")
>>block <- rep(c(1,2,1,2,1,2,1,2),each=3)
>>corfit <- duplicateCorrelation(MA.half.scale[RG$genes$Status=="cDNA",], 
>>design, ndups=1, block=block)
>>corfit$consensus
>>     #[1] 0.1537080
>>
>>Jenny Drnevich, Ph.D.
>>Functional Genomics Bioinformatics Specialist
>>W.M. Keck Center for Comparative and Functional Genomics
>>Roy J. Carver Biotechnology Center
>>University of Illinois, Urbana-Champaign
>>330 ERML
>>1201 W. Gregory Dr.
>>Urbana, IL 61801
>>USA
>>ph: 217-244-7355
>>fax: 217-265-5066
>>e-mail: drnevich at uiuc.edu
>>
>


