[BioC] problem to compute copy number with crlmm

Tue Nov 17 12:20:57 CET 2009

Hi Robert,

I have a similar situation, regarding genotypes and CNV estimates.  I
have a small set of samples (29) that I have combined with the CEPH CEU
dataset (90 samples) to increase the sample size in the initial
processing, and thus hopefully the robustness of the estimates.  Do I
understand correctly that you are advising NOT to include the CEU
samples, particularly for CNV estimates?

In terms of identifying batch effects within the sample set, what method
would you recommend?

I also have another dataset which, although considerably larger, has the
added confounders of:

1. having been generated in dribs and drabs over the past 3-4 years

2. multiple chips for the same patient but taken at various times from
various normal and cancerous tissues.

I can't do anything about the first problem, but since I want to run
CNV analyses, I was wondering what you or others might suggest when
generating or modeling the CNV estimates.  I am now uncertain
as to the wisdom of including the CEU "reference" set in these analyses.

Thoughts/comments?

Cheers and thanks,

al

> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch 
> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of 
> Robert Scharpf
> Sent: 12 November 2009 13:06
> To: bioconductor at stat.math.ethz.ch; soerengroettrup at uni-muenster.de
> Subject: Re: [BioC] problem to compute copy number with crlmm
> 
> 
> 
> On Nov 12, 2009, at 6:00 AM, bioconductor-request at stat.math.ethz.ch  
> wrote:
> 
> > Message: 4
> > Date: Wed, 11 Nov 2009 14:23:31 +0100 (CET)
> > From: Sören Gröttrup <soerengroettrup at uni-muenster.de>
> > Subject: Re: [BioC] problem to compute copy number with crlmm
> > To: <bioconductor at stat.math.ethz.ch>
> >
> > Thanks for the fast answer. I will install the new version as soon as
> > possible.
> >
> > I have 12 samples available, but some of them contain the same data
> > because I copied them to get more than 10 samples. Hope that's not a
> > problem.
> >
> > Sören
> 
> Hi Sören,
> 
> Yes, this would explain zeros for the within-genotype variance and the
> resulting error you observed.  CRLMM does not pre-compute the
> parameters needed to estimate copy number because of large batch
> effects.  The best solution would be to run these samples with other
> samples that were processed at a similar time in the same lab.  If this
> is not possible, an alternative is to obtain a collection of samples
> processed at a different time in a different lab (such as HapMap).
> The problem with this latter approach is that apparent alterations in
> copy number estimates likely reflect batch differences.  The genotype
> estimates provided by crlmm are much more robust to batch effects than
> copy number estimates -- running crlmm on just your samples is
> perfectly fine.  Getting reliable estimates of copy number for a very
> small number of samples is a challenging problem.
> 
> Rob
> 
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor