[BioC] W:  unnormalised vs normalised distribution
    Benjamin Otto 
    b.otto at uke.uni-hamburg.de
       
    Wed Jun 21 10:02:56 CEST 2006
    
    
  
Hi Michal,
thanks again for your reply. The QC plot is usually one of the first things
I do. I have attached a jpeg file again; you will notice that only one
sample differs from all the others. Still, the distribution shape in the
plots you have is not so different. Of course you can never KNOW that there
has been no systematic error on the level of amplification or hybridization.
But I'm quite sure there hasn't been. I'll skip sending the degradation
plot, but to mention it: all degradation lines are nearly perfectly
parallel. So the quality seems ok to me. Any suggestions as to what I may
have missed checking?
Regarding throwing out the absent genes: have a look at the attached gcrma
expression distribution. This is really the same data, even though not all
samples are included this time. A short description of what has been changed:
a) The total dataset has different groups. Now this is restricted to two
groups.
b) If a gene is present in at least 30% of the samples in at least one of
the two groups, it is used; otherwise it is skipped.
c) For the reduced set the distribution is plotted.
The attached file is the result. Funny, isn't it? Seems to support your
hypothesis of biological bimodality, or what would you say?
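
The filter in b) can be sketched roughly as follows in Python. This is only
an illustration under stated assumptions, not the code that produced the
attached plot: the function name filter_present, the toy call matrix and the
group labels "g1"/"g2" are hypothetical, and it assumes MAS5-style P/M/A
detection calls in which only "P" counts as present.

```python
import numpy as np

def filter_present(calls, groups, min_fraction=0.3):
    """Return a boolean mask of probesets to keep.

    calls  : 2-D array of detection calls ("P", "M", "A"),
             one row per probeset, one column per sample
    groups : 1-D array with one group label per sample
    A probeset is kept if it is called "P" in at least
    min_fraction of the samples of at least one group.
    """
    calls = np.asarray(calls)
    groups = np.asarray(groups)
    present = calls == "P"              # assumption: "M" is not counted as present
    keep = np.zeros(present.shape[0], dtype=bool)
    for g in np.unique(groups):
        # fraction of present calls per probeset within this group
        frac = present[:, groups == g].mean(axis=1)
        keep |= frac >= min_fraction
    return keep

# Hypothetical toy data: 4 probesets, 2 groups of 3 samples each
calls = np.array([
    ["P", "A", "A", "A", "A", "A"],   # present in 1 of 3 samples of g1
    ["A", "A", "A", "A", "A", "A"],   # never present
    ["A", "A", "A", "P", "P", "A"],   # present in 2 of 3 samples of g2
    ["M", "A", "A", "A", "M", "A"],   # only marginal calls
])
groups = np.array(["g1", "g1", "g1", "g2", "g2", "g2"])

keep = filter_present(calls, groups)
print(keep)
```

The resulting mask would then be used to select rows of the expression
matrix before plotting the distribution of the reduced set, as in c).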
Regards
Benjamin
> -----Original Message-----
> From: Michal Okoniewski [mailto:MOkoniewski at PICR.man.ac.uk]
> Sent: 15 June 2006 11:42
> To: Benjamin Otto
> Subject: RE: [BioC] unnormalised vs normalised distribution
> 
> Hi Benjamin,
> 
> Thanks for the figures. It seems (to me) that there could be a sort of
> "biological" bimodality in your data - if even MAS has two peaks....
> No idea what that effect could mean - some systematic error on the 
> level of amplification or pooling of the RNA??
> Perhaps it is something to discuss with the people who prepared the 
> arrays for you - a chat on experimental design and quality control 
> measures (to advertise my lab: have you run QC functions from 
> simpleaffy? ;) ). I'm just thinking aloud here...
> 
> The sort of bimodality I have seen in my data is more related to the 
> method - as I have seen two peaks with GCRMA and a "nicer" (but not
> "normal") distribution with RMA for my data. That's why I expressed my
> concern about GCRMA in general and described my experiment with
> correlation of "everything vs everything" where GCRMA was clearly not
> normal, whereas other methods were.
> 
> It's a pity that the discussion on the BioC list was cut in such a way
> ("there was already a thread!")  - this (and yet another person) 
> discouraged me from writing to the BioC list for some time.
> 
> Thinking aloud again: perhaps the bimodality artefacts add up with 
> some specific qualities of your data?
> What about the distribution after the detection call filtering?
> Probably you've already tried some - I mean things like filtering
> out probesets with all A calls or selecting probesets with all P
> calls and checking what the distributions look like...
> 
> all the best,
> Michal
> 
> 
> -----Original Message-----
> From: Benjamin Otto [mailto:b.otto at uke.uni-hamburg.de]
> Sent: 15 June 2006 09:58
> To: Michal Okoniewski; bioconductor at stat.math.ethz.ch
> Subject: RE: [BioC] unnormalised vs normalised distribution
> 
> Hi Michal,
> 
> there are two jpeg pics attached showing the corresponding 
> distributions for mas5 and rma. You could call it a normal distribution
> for mas5, although it seems to me that the two peaks are still visible.
> And the rma version is indeed not so clearly bimodal.
> 
> Benjamin
> 
> > -----Original Message-----
> > From: Michal Okoniewski [mailto:MOkoniewski at PICR.man.ac.uk]
> > Sent: 09 June 2006 15:13
> > To: botto; bioconductor at stat.math.ethz.ch
> > Subject: RE: [BioC] unnormalised vs normalised distribution
> >
> >
> > Benjamin,
> >
> > And what do you get when you use standard RMA instead of GCRMA?
> > I have run GCRMA on my data and see a similar pattern - the second
> > peak is small, but it is there. In general GCRMA seems to result in
> > many more small values - RMA has a bit "nicer" distribution.
> >
> > I once computed the distribution of correlations of all probesets
> > against all. For MAS and RMA I got what I expected - an almost normal
> > distribution, slightly shifted towards positive r. For GCRMA the
> > distribution did not really have a normal shape (an almost symmetric
> > convex function with the maximum roughly in the same place as RMA -
> > close to 0). From that time on, I believe that GCRMA may impose some
> > artifacts onto data...
> >
> > Cheers,
> > Michal
> >
> > -----Original Message-----
> > From: bioconductor-bounces at stat.math.ethz.ch
> > [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of botto
> > Sent: 08 June 2006 13:24
> > To: bioconductor at stat.math.ethz.ch
> > Subject: [BioC] unnormalised vs normalised distribution
> >
> > Dear list members,
> >
> > I've been looking at the distribution plots of pm intensities and 
> > the corresponding expression values calculated by gcrma for a 
> > certain data set. Now I'm wondering what the best interpretation of
> > these plots would be, because the former looks quite usual while the
> > latter seems quite "unfamiliar" to me (nearly like a bimodal
> > distribution, a jpg file should be attached). The data measured is
> > simply the expression for certain mouse tumor tissues. Can anybody
> > explain why after the background correction and normalisation I get
> > this distribution shape?
> >
> > log(PM) density:
> >
> > |      *
> > |    *  *
> > |   *    *
> > |  *     *
> > |  *      *
> > | *        *
> > |*          *
> > |             ***********
> > +-------------------------------
> >
> > gcrma-expression values:
> >
> > |      *
> > |    *  *
> > |   *    *
> > |  *     *
> > |  *      *
> > | *        *     * * *
> > |*          ****       *
> > |                        ***
> > +----------------------------------
> >
> >
> >
> >
> >
> > --
> > Benjamin Otto
> > Universitaetsklinikum Eppendorf Hamburg
> > Institut fuer Klinische Chemie
> > Martinistrasse 52
> > 20246 Hamburg
> >
> > --------------------------------------------------------
> >
> >
> > This email is confidential and intended solely for the use of the
> > person(s) ('the intended recipient') to whom it was addressed.
> > Any views or opinions presented are solely those of the author and 
> > do not necessarily represent those of the Paterson Institute for 
> > Cancer Research or the University of Manchester. It may contain 
> > information that is privileged & confidential within the meaning of 
> > applicable law. Accordingly any dissemination, distribution, 
> > copying, or other use of this message, or any of its contents, by 
> > any person other than the intended recipient may constitute a breach 
> > of civil or criminal law and is strictly prohibited. If you are NOT 
> > the intended recipient please contact the sender and dispose of this 
> > e-mail as soon as possible.
> >
> >
> 
>
    
    