[BioC] replicates and low expression levels

Claire Wilson ClaireWilson at picr.man.ac.uk
Mon Jun 2 12:17:26 MEST 2003


>On Fri, May 30, 2003 at 05:28:45PM +0100, Crispin Miller wrote:
> > Hi,
> > Just a quick question about low expression levels on Affy systems - I
> > hope it's not too off-topic; it is about normalisation and data analysis...
> > I've heard a lot of people advocating that it's a good idea to perform
> > an initial filtering on either Present, Marginal or Absent calls, or on
> > gene-expression levels (so that only genes with an expression > 40, say,
> > after scaling to a TGT of 100 using the MAS5.0 algorithm, are part of the
> > further analysis). Firstly, am I right in thinking that this is to
> > eliminate data that are too close to the background noise level of the system?
> >
> > I wanted to canvass opinion as to whether people feel we need to do this
> > if we have replicates and are using statistical tests - rather than just
> > fold-changes - to identify 'interesting' genes. Does the statistical
> > testing do this job for us?
>
>Hi,
>   In my opinion you should always do some sort of non-specific
>   filtering. What you have described is one form of it, others include
>   removing genes that show little or no variability across samples.
>   I think of non-specific filtering as filtering without reference to
>   phenotype (of any sort).
>
>   There are a number of reasons for doing this, some motivated by the
>   biology and some by the statistics.
>
>   First off, especially for Affy, the chip is designed for all tissue
>   types but a commonly held belief is that only about 40% of the genome
>   is expressed in any specific tissue type. So, for any experiment you
>   will have a pretty large number of probes for genes that are not
>   expressed in the tissue you are looking at.
>   From a statistical perspective you need to be a little bit cautious
>   if you are going to standardize genes across samples (this is pretty
>   common). If you do not remove those genes that show little
>   variability before standardization then you have just elevated the
>   noise to the same status as the signal (and if the 40% estimate is
>   right then you actually have more noise than signal - not too
>   pleasant).

Hi,

Just to clarify a couple of points. This suggests to me that genes with low expression need to be filtered out prior to normalization. I was wondering how this is achieved in Bioconductor without the use of Present/Absent calls. And, following on from a later point:
 
>   you have just carried out). It seems to me to be much easier to just
>   filter those genes with no expression or little variation out at the
>   very start.

What would be your filter for no expression or little variation?
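
As a concrete illustration of the question, a non-specific filter of the kind described above might look like the following minimal sketch. It is written in plain Python rather than with any Bioconductor function, the names `iqr` and `nonspecific_filter` are hypothetical, and the cutoffs are assumptions: 40 echoes the expression level mentioned earlier in the thread, while the 25% fraction and the IQR cutoff of 10 are arbitrary placeholders, not recommendations.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range of a list of numbers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def nonspecific_filter(expr, min_level=40.0, min_fraction=0.25, min_iqr=10.0):
    """Return the gene IDs that pass both filters.

    expr maps a gene ID to its expression values across samples.
    No phenotype labels are used, so the filter is non-specific.
    All three cutoffs are illustrative assumptions.
    """
    kept = []
    for gene, values in expr.items():
        # Expression filter: enough samples above the chosen level.
        n_above = sum(v > min_level for v in values)
        expressed = n_above >= min_fraction * len(values)
        # Variability filter: enough spread across samples.
        variable = iqr(values) >= min_iqr
        if expressed and variable:
            kept.append(gene)
    return kept


# Made-up values, purely for illustration.
expr = {
    "gene_a": [12.0, 15.0, 11.0, 14.0],      # near background
    "gene_b": [250.0, 255.0, 252.0, 251.0],  # expressed but flat
    "gene_c": [60.0, 300.0, 80.0, 420.0],    # expressed and variable
}
print(nonspecific_filter(expr))
```

With these made-up values only gene_c passes both filters. Whether cutoffs like these are sensible, and whether the two filters should be combined in this way, is exactly what the question is asking.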

Sorry if these questions are a little basic.

Thanks

Claire
 