[BioC] Discrepancy in differential expression gene list after intraspot correlation

Gordon K Smyth smyth at wehi.EDU.AU
Sun Oct 17 01:28:48 CEST 2010


Dear Priyanka,

Please keep questions on the list; I've cc'd this reply to Bioconductor.

There is no "error".  You simply have a warning message.  Since the 
warning relates to only a couple of genes, and the intraspot correlation 
is estimated robustly across all the genes, it will have no deleterious 
effect on your analysis.
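
A small base-R sketch of why a couple of badly estimated genes barely move a robust consensus. It assumes the consensus is a trimmed mean of the per-gene correlations on the atanh scale, which is what the `trim` argument of intraspotCorrelation() (default 0.15) suggests. The per-gene values are simulated for illustration only and do not come from this experiment; `consensus()` is a hypothetical helper, not a limma function.

```r
set.seed(1)

# Simulated per-gene correlations on the atanh scale (illustrative values only).
z <- rnorm(1000, mean = atanh(0.8), sd = 0.3)

# Pretend the REML iteration went badly wrong for two genes.
z_bad <- z
z_bad[1:2] <- c(8, 9)

# Assumed form of the consensus: trimmed mean on the atanh scale,
# back-transformed to a correlation with tanh().
consensus <- function(z, trim = 0.15) tanh(mean(z, trim = trim, na.rm = TRUE))

consensus_clean <- consensus(z)
consensus_bad <- consensus(z_bad)
print(c(clean = consensus_clean, with_bad_genes = consensus_bad))
```

With 15% trimmed from each tail, the two extreme values are discarded before averaging, so the two estimates should differ only in the later decimal places.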

Regarding your question about DE genes: when you normalize the data 
properly, you will naturally tend to get more DE genes than if you don't, 
because more of the technical variation has been removed, so the residual 
standard errors are reduced.
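
For reference, a minimal sketch of the separate-channel setup for the design quoted below, following the usual limma pattern (targetsA2C, intraspotCorrelation, lmscFit). The targets frame is typed in from the table in this thread; the contrast names and the helper `fit_single_channel()` with its `aquantile` switch are hypothetical, added only so the analysis can be run with and without between-array normalization. `MA` is assumed to be an MAList that has already been normalized within arrays.

```r
library(limma)

# Targets frame copied from the design table quoted in this thread.
targets <- data.frame(
  FileName = c("13961538.gpr", "13961556.gpr", "13961428-2.gpr",
               "13961652.gpr", "13961645.gpr", "13961646.gpr",
               "13961649.gpr", "13961635.gpr", "13961637.gpr",
               "13961661.gpr", "13961800.gpr"),
  Cy3 = rep(c("RT30", "RT0", "CT30", "CT0"), times = c(3, 3, 3, 2)),
  Cy5 = rep(c("RT0", "RT30", "CT0", "CT30"), times = c(3, 3, 3, 2)),
  stringsAsFactors = FALSE
)

# One row per channel (2 x 11 = 22 rows), then one design column per RNA target.
targets2 <- targetsA2C(targets)
u <- unique(as.character(targets2$Target))
f <- factor(targets2$Target, levels = u)
design <- model.matrix(~ 0 + f)
colnames(design) <- u

# Between-group comparisons at each time point (contrast names are made up).
cont.matrix <- makeContrasts(
  RAOvsCon.T0  = RT0 - CT0,
  RAOvsCon.T30 = RT30 - CT30,
  levels = design
)

# Hypothetical helper: fit the separate-channel model, optionally after
# Aquantile between-array normalization.
fit_single_channel <- function(MA, design, cont.matrix, aquantile = TRUE) {
  if (aquantile) MA <- normalizeBetweenArrays(MA, method = "Aquantile")
  corfit <- intraspotCorrelation(MA, design)
  fit <- lmscFit(MA, design, correlation = corfit$consensus.correlation)
  eBayes(contrasts.fit(fit, cont.matrix))
}
```

Calling the helper once with `aquantile = FALSE` and once with `aquantile = TRUE`, then comparing the topTable() output for each coefficient, would give the with-and-without comparison discussed here.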

Best wishes
Gordon


On Fri, 15 Oct 2010, Kachroo, Priyanka wrote:

> Dear Dr. Smyth,
>
> I appreciate you taking time out to reply to my query and apologize for 
> the inadequate information in my posting.
>
> My old analysis was also a single-channel analysis; the difference is 
> that I did not use the between-array normalization method "Aquantile".
>
> In the recent analysis the code was the same except that I used 
> Aquantile normalization between arrays, and this is when I get the 
> warning message ("In remlscore(y, X, Z) : reml: Max iterations exceeded") 
> and the list of differentially expressed genes changes. From the old 
> analysis, with cutoffs of logFC 0.58 and p-value < 0.05, I obtained 22 DE 
> genes; with my recent analysis I obtain 104 DE genes.
>
> I have attached text with code for new and old analysis.
> Experimental design is as follows:
>
> SlideNumber	FileName	Cy3	Cy5	Identity
> 13961538	13961538.gpr	RT30	RT0	RAO103
> 13961556	13961556.gpr	RT30	RT0	RAO61
> 13961428	13961428-2.gpr	RT30	RT0	RAO40
> 13961652	13961652.gpr	RT0	RT30	RAO123
> 13961645	13961645.gpr	RT0	RT30	RAO306
> 13961646	13961646.gpr	RT0	RT30	RAO312
> 13961649	13961649.gpr	CT30	CT0	Con32
> 13961635	13961635.gpr	CT30	CT0	Con53
> 13961637	13961637.gpr	CT30	CT0	Con68
> 13961661	13961661.gpr	CT0	CT30	Con87
> 13961800	13961800.gpr	CT0	CT30	Con106
>
>
>
> Direct 2-color hybridizations were done between 5 control animals (2 
> time points) and, separately, 6 hybridizations were done between time 0 
> and time 30 for RAO (disease group). I needed to compare RAO t30 versus 
> control t30 and similarly RAO t0 versus control t0, and hence decided to 
> perform a single-channel analysis. So my question is: what is causing 
> this error and the discrepancy in the results?
>
> I appreciate your help in this regard.
>
> Priyanka Kachroo
> Graduate Assistant Research
> Texas A&M University
>
> ----- Original Message -----
> From: "Gordon K Smyth" <smyth at wehi.EDU.AU>
> To: "Priyanka Kachroo" <priya_coll at neo.tamu.edu>
> Cc: "Bioconductor mailing list" <bioconductor at stat.math.ethz.ch>
> Sent: Thursday, October 14, 2010 5:06:14 PM GMT -06:00 US/Canada Central
> Subject: [BioC] Discrepancy in differential expression gene list after intraspot correlation
>
> Dear Priyanka,
>
> Sorry this is causing you problems, but there's not much we can do to help
> you based on the information you give.
>
> You say that you did an analysis a few months back, and another analysis
> more recently, and got somewhat different lists of genes.  But you don't
> tell us what the two analyses were or how they were different, except that
> at least one of them is a single channel approach.  It's not much to go
> on!  There has been no change to the limma code in the meantime, so it
> must be your analysis that has changed.
>
> I'm guessing that your first analysis might have been a standard log-ratio
> analysis.  Of course, this will give somewhat different results to a
> single-channel analysis, especially when there is very little differential
> expression to start with.  The single-channel analysis will generally
> detect more differential expression.  When the effects are strong, the
> same genes will appear in both analyses with much the same fold changes,
> but if the effects are very weak, the single-channel analysis will pick up
> more effects and the gene order may change quite a bit.
>
> If you still need advice, it would be best to at least give your targets
> frame so that we can see what your experimental design is.  But if the
> issue is simply that log-ratio and single-channel analyses can give
> different results, there's not much mystery about that.
>
> Best wishes
> Gordon
>
>> Date: Wed, 13 Oct 2010 10:48:12 -0500 (CDT)
>> From: "Kachroo, Priyanka" <priya_coll at neo.tamu.edu>
>> To: "." <bioconductor at stat.math.ethz.ch>
>> Subject: [BioC] Discrepancy in differential expression gene list after
>> 	intraspotCorrelation
>>
>> Dear BioC
>>
>> I am trying to perform a single channel analysis. Previously (a
>> couple of months back) I did not get a warning message with 10 microarray
>> slides (6 diseases + 4 controls), and now when I try to re-run the
>> analysis I get the message "In remlscore(y, X, Z) : reml: Max iterations
>> exceeded" after I do this step:
>> corfit <- intraspotCorrelation(MA.Aq, design)
>>
>> I read in previous posts that we can ignore some warning messages;
>> however, if I had 30 differentially expressed genes (DE genes) in the old
>> analysis I now get 120 genes. There is not much overlap between the
>> results, and this is what concerns me.
>>
>> Experiment design
>>
>> Control (2 time points: Ct0 and Ct10) and Disease (2 time points: Dt0 and Dt10).
>>
>> I hybridized Ct0 and Ct10 (4 samples) and Dt0 and Dt10 (6 samples).
>> Now I would like to get DE genes for Dt0 versus Ct0 and also Dt10 versus
>> Ct10, and so I resorted to single channel analysis. Can limma experts
>> please help me in this regard? I am a graduate student and am really
>> struggling with this problem.
>>
>>
>> Best Regards
>> Priyanka
>> Graduate Assistant Research
>> Texas A&M University
>


