[BioC] Advice with RemoveBatchEffect and Rank Product package

Gordon K Smyth smyth at wehi.EDU.AU
Mon Sep 10 09:04:37 CEST 2012


Dear Osee,

No, you can't use removeBatchEffect to control for dye bias.
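
For reference, the error quoted further down the thread can be reproduced in base R. This is only a minimal sketch, assuming (as the error message itself suggests) that the batch factor is coded with contr.sum; the variable names are illustrative and not from the original post.

```r
# One "batch" covering all 24 arrays, as in batch = rep(1, 24)
one_batch <- factor(rep(1, 24))
nlevels(one_batch)   # a single level, so there is nothing to contrast

# Sum-to-zero contrasts need at least two levels; with one level R stops
# with "not enough degrees of freedom to define contrasts"
res <- tryCatch(contr.sum(levels(one_batch)),
                error = function(e) conditionMessage(e))
print(res)

# With two batches (hypothetical split) the contrast matrix is well defined
two_batches <- factor(rep(1:2, each = 12))
cm <- contr.sum(levels(two_batches))
print(cm)
```

A dye bias in log-ratio data is not a difference between groups of arrays, which is one way to see why a batch factor is the wrong tool here.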

Can you ignore the dye effect?  Not in general, but who knows?

Your experiment seems too complex to be properly analysed using RankProd. 
For one thing, it seems clear that you have obtained multiple parts of the 
brain from the same biological replicates, meaning that your samples are 
paired by fish number.

I could explain how to analyse this experiment using limma.  However, if 
you are determined that you will use RankProd, it might be best to email 
the authors of that package for advice.
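
As a rough illustration only, and not necessarily the analysis meant above: for dye-swapped log-ratios, the limma User's Guide describes adding an intercept column to the design matrix so that probe-specific dye effects are estimated alongside the treatment effect. The base-R sketch below shows the idea for a single probe with simulated values; the numbers, variable names, and the use of lm.fit in place of limma are all assumptions made for the example.

```r
set.seed(1)

# Hypothetical data for one probe across 6 dye-swapped arrays
swap     <- c(1, -1, 1, -1, 1, -1)  # +1 / -1 dye orientation, as in a dye-swap design column
dye_bias <- 0.4                     # assumed probe-specific dye effect
trt      <- 1.0                     # assumed true log-ratio, exp vs ctrl
M        <- dye_bias + swap * trt + rnorm(6, sd = 0.05)

# The intercept column absorbs the dye effect; the swap column carries the treatment
design <- cbind(DyeEffect = 1, ExpVsCtrl = swap)
fit    <- lm.fit(design, M)
est    <- fit$coefficients
print(round(est, 2))
```

In limma the same kind of design matrix would be passed to lmFit. The pairing of samples by fish would need additional handling that this sketch ignores.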

Best wishes
Gordon

---------------------------------------------
Professor Gordon K Smyth,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.
http://www.statsci.org/smyth

On Sun, 9 Sep 2012, Osee Sanogo wrote:

> Dear Gordon,
>
> Thank you for getting back to me about my questions.
>
> My experiment is trying to identify differentially expressed genes in four
> regions of the brain in response to a stressor. I have 6 biol. replicates in
> each brain region for the control and experimental groups in each region,
> and the comparison is being done within brain region (i.e., T control vs T
> exp, D ctrl vs D exp, C ctrl vs C exp, BS ctrl vs BS exp). The samples were
> run on two-color Agilent arrays.
>
> You're right that the design I sent was from the separate channel analysis,
> in which I attempted to account for array and dye effect, and then run the
> data in RankProd. Now I know that this is not right. Ok, I will use the
> single channel analysis in Limma.
>
> I still would like to run the two-channel data (ratios) in RankProd, as in my
> previous experience this was useful for my data (low replicate numbers).
>
> My questions: 1) Could I use removeBatchEffect to control for dye bias
> before running the two-channel data in RankProd? If yes, how should I do
> this using the removeBatchEffect function?
>            2) I found that about 3% of my probes have a dye effect. Can I
> omit controlling for dye effect without compromising the result of my
> analysis?
>
> The data were loess/scale normalized into an expression set (Data_RP).
>
> Here is the design of the experiment
>
>    FileName Cy3 Cy5 Fish.Number Slide Brain.Part Weight Length
> 1    1T.gpr   1  -1           1     2          T     39   0.63
> 2    2T.gpr  -1   1           2     1          T     39   0.63
> 3    3T.gpr   1  -1           3     4          T     39   0.63
> 4    4T.gpr  -1   1           4     3          T     39   0.63
> 5    5T.gpr   1  -1           5     6          T     39   0.63
> 6    6T.gpr  -1   1           6     5          T     NA     NA
> 7    1D.gpr  -1   1           1     5          D     47   1.21
> 8    2D.gpr   1  -1           2     4          D     47   1.21
> 9    3D.gpr  -1   1           3     1          D     47   1.21
> 10   4D.gpr   1  -1           4     6          D     47   1.21
> 11   5D.gpr  -1   1           5     3          D     47   1.21
> 12   6D.gpr   1  -1           6     2          D     NA     NA
> 13   1C.gpr   1  -1           1     4          C     47   1.31
> 14   2C.gpr  -1   1           2     3          C     47   1.31
> 15   3C.gpr   1  -1           3     6          C     47   1.31
> 16   4C.gpr  -1   1           4     5          C     47   1.31
> 17   5C.gpr   1  -1           5     2          C     47   1.31
> 18   6C.gpr  -1   1           6     1          C     NA     NA
> 19  1BS.gpr  -1   1           1     1         BS     89   1.44
> 20  2BS.gpr   1  -1           2     2         BS     89   1.44
> 21  3BS.gpr  -1   1           3     3         BS     89   1.44
> 22  4BS.gpr   1  -1           4     4         BS     NA     NA
> 23  5BS.gpr  -1   1           5     5         BS     NA     NA
> 24  6BS.gpr   1  -1           6     6         BS     NA     NA
>
> Thank you for your help and please let me know if you need further
> explanation of the experiment.
>
> Best regards,
>
> Osee
>
>>
>
>
> On 9/9/12 7:24 PM, "Gordon K Smyth" <smyth at wehi.EDU.AU> wrote:
>
>> Dear Osee,
>>
>> You are attempting to do a number of things that don't seem correct to me.
>>
>> First, you seem to be attempting a separate channel analysis of two-color
>> microarray data, but ignoring the pairing of the red and green channels.
>> It isn't correct to do this.  I don't see any way to use RankProd, or any
>> other package designed for independent samples, correctly in this context.
>> If you must do a separate channel analysis, you would be better off using
>> the separate channel analysis facilities of the limma package.
>>
>> Second, when you set batch=rep(1,24), you are defining a batch that
>> consists of every array in your data set.  Obviously it doesn't make sense
>> to remove batch effects unless there are at least two batches.
>>
>> Third, I don't follow your design matrix.
>>
>> It would be better to go back to the start, and describe in more basic
>> terms what is the nature of your data and what comparison you are trying
>> to make.
>>
>> Best wishes
>> Gordon
>>
>>> Date: Sat, 8 Sep 2012 11:40:45 +0000
>>> From: "Sanogo, Yibayiri O" <sanogo at illinois.edu>
>>> To: "bioconductor at r-project.org" <bioconductor at r-project.org>
>>> Subject: [BioC] Advice with RemoveBatchEffect and Rank Product package
>>>
>>> Dear Members of the list,
>>>
>>> (I apologize for posting this again. I sent it earlier to the list but
>>> from another account and was listed as a non-member, although I have
>>> been a member since 2008 :-)).
>>>
>>> I have been using Rank Prod to analyze Agilent two-color data. However, I
>>> would like to remove the dye effect prior to analysis. I read on the forum
>>> that RemoveBatchEffect should be used in the Limma linear model, a type of
>>> modeling that is not in Rank Product.
>>>
>>> I have two questions:
>>>
>>> 1) Would it be appropriate to use RemoveBatchEffect to correct for dye
>>> effect prior to running the expression data using Rank Prod?
>>>
>>> 2) a) If no, what other function could I use to do this?
>>>   b) If yes, I would like help with the correct design and how to
>>> properly indicate the batch.
>>>
>>> Here is my design indicating the two dyes (cy3=-1, cy5=1; T, D, C, BS are
>>> different areas of the brain):
>>>
>>> design1
>>>   BS  C  D  T
>>> 1   0  0  0  1
>>> 2   0  0  0 -1
>>> 3   0  0  0  1
>>> 4   0  0  0 -1
>>> 5   0  0  0  1
>>> 6   0  0  0 -1
>>> 7   0  0 -1  0
>>> 8   0  0  1  0
>>> 9   0  0 -1  0
>>> 10  0  0  1  0
>>> 11  0  0 -1  0
>>> 12  0  0  1  0
>>> 13  0  1  0  0
>>> 14  0 -1  0  0
>>> 15  0  1  0  0
>>> 16  0 -1  0  0
>>> 17  0  1  0  0
>>> 18  0 -1  0  0
>>> 19 -1  0  0  0
>>> 20  1  0  0  0
>>> 21 -1  0  0  0
>>> 22  1  0  0  0
>>> 23 -1  0  0  0
>>> 24  1  0  0  0
>>>
>>> attr(,"assign")
>>> [1] 1 1 1 1
>>>
>>> I've tried this (Data_RP are my data, the M values of the expression set):
>>>
>>> DYE_RP<-removeBatchEffect(Data_RP, batch=rep(1,24), batch2=NULL,
>>> design=design1)
>>>
>>> but it is returning an error message
>>> " Error in contr.sum(levels(batch)) :
>>>  not enough degrees of freedom to define contrasts"
>>>
>>> Please help me correct this code.
>>>
>>> Thank you so much for your help.
>>>
>>> Osee
>>>
>>> -- -- --
>>> Y. Osee Sanogo
>>> Integrative Biology
>>> Institute for Genomic Biology
>>> University of Illinois at Urbana
>>> 505 S. Goodwin Ave
>>> Urbana, IL-61801
>>>
>>> Tel: 217-333 2308 (Office)
>>>     217-417 9593 (Cell)
