[BioC] How to read a subset of the .CEL files
    Henrik Bengtsson 
    hb at maths.lth.se
       
    Mon Jun 26 12:30:51 CEST 2006
    
    
  
See the affxparser package, e.g. readCelUnits(filenames,
units=c(1,600:612,45)).  At the moment, you have to take it from there
yourself.
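
A minimal sketch of how that call could be driven from a list of probeset
names follows. It assumes the .CEL files sit in the working directory, that
findCdf() can locate the matching CDF file, and that the data frame
Glycine.max.Species.AffyID.df from the quoted message exists. The helper
matchUnits() is hypothetical and not part of affxparser.

```r
# Sketch only: assumes affxparser is installed, the .CEL files are in the
# working directory, and the matching CDF file can be found by findCdf().

# Map probeset names (possibly stored as a factor) to CDF unit indices.
# 'matchUnits' is a hypothetical helper, not part of affxparser.
matchUnits <- function(unitNames, wanted) {
  which(unitNames %in% as.character(wanted))
}

celFiles <- list.files(pattern = "[.](cel|CEL)$")

if (requireNamespace("affxparser", quietly = TRUE) && length(celFiles) > 0) {
  # The chip type is recorded in the CEL header; use it to find the CDF file.
  chipType <- affxparser::readCelHeader(celFiles[1])$chiptype
  cdfFile <- affxparser::findCdf(chipType)
  if (is.null(cdfFile)) stop("No CDF file found for chip type ", chipType)

  # Unit (probeset) names in CDF order; their positions are the unit indices.
  unitNames <- affxparser::readCdfUnitNames(cdfFile)
  units <- matchUnits(unitNames, Glycine.max.Species.AffyID.df$Affy.ID)

  # Read probe-level data for the selected units only, across all CEL files.
  cells <- affxparser::readCelUnits(celFiles, units = units, cdf = cdfFile)
  str(cells[[1]])
}
```

The result is a plain nested list (unit, then group, then fields such as
intensities), so any summarization into expression values is left to you.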
Henrik Bengtsson
On 6/24/06, James W. MacDonald <jmacdon at med.umich.edu> wrote:
> Hi Greg,
>
> Alvord, Greg (DMS) [Contr] wrote:
> > Dear List -
> >
> > I am new to BioConductor and R, working under Windows with a gig of RAM,
> > version R-2.2.1 of R.  I have successfully read in six .CEL files and
> > created the following AffyBatch object.
> >
> >>soy.ab
> > AffyBatch object
> > size of arrays=1164x1164 features (63516 kb)
> > cdf=Soybean (61170 affyids)
> > number of samples=6
> > number of genes=61170
> > annotation=soybean
> >
> > The investigator for whom I'm working is interested in an analysis of
> > differential gene expression on a subset of affyids in this AffyBatch
> > object, specifically in 37,744 of the 61,170 affyids (indicated above)
> > that relate specifically to the soybean genome.  I have learned that the
> > relevant species of interest is labeled 'Glycine max'.  I obtained this
> > information from another source and have not (due to my ignorance) been
> > able to identify any slot in soy.ab AffyBatch object that identifies
> > this species.  Here is a table of the species on the soy.ab AffyBatch
> > object (which I obtained from another source).
> >
> >>cbind(table(Species))
> >                                           [,1]
> > Alfalfa mosaic virus                         3
> > Bean pod mottle virus strain G-7             2
> > Bean pod mottle virus strain K-Hancock1      1
> > Clover yellow vein virus                     1
> > Glycine max                              37744
> > Heterodera glycines                       7539
> > Phytophthora sojae                       15864
> > S. saman                                     4
> > Southern bean mosaic virus strain SBMV-S     1
> > Soybean mosaic virus                         1
> > Soybean mosaic virus strain G5               3
> > Soybean mosaic virus strain G7               1
> > Soybean mosaic virus strain N                1
> > Tobacco ringspot virus                       2
> > Tobacco streak virus                         3
> >
> > I want to select from the soy.ab AffyBatch object the relevant
> > information for the species 'Glycine max' only.  I have created a data
> > frame containing those Affy.ID's for species 'Glycine max', e.g.,
> >
> >>Glycine.max.Species.AffyID.df[c(1:3,37742:37744),]
> >           Species                Affy.ID
> > 8     Glycine max         AFFX-BioB-3_at
> > 9     Glycine max         AFFX-BioB-5_at
> > 10    Glycine max         AFFX-BioB-M_at
> > 37749 Glycine max soybean_rRNA_838_RC_at
> > 37750 Glycine max    soybean_rRNA_918_at
> > 37751 Glycine max soybean_rRNA_918_RC_at
> >
> >>dim(Glycine.max.Species.AffyID.df)
> > [1] 37744     2
> >
> > How do I extract/create an AffyBatch object containing only the
> > appropriate Affy.ID's related to the 'Glycine max' species?
>
> An AffyBatch object isn't the best for subsetting this way. Better would
> be to compute expression values using rma() or your favorite method, and
> then subset.
>
> eset <- rma(soy.ab)
> # as.character() in case Affy.ID is a factor (factors index by integer code)
> subsetted.exprset <- eset[as.character(Glycine.max.Species.AffyID.df[,2]),]
>
> HTH,
>
> Jim
>
> --
> James W. MacDonald
> University of Michigan
> Affymetrix and cDNA Microarray Core
> 1500 E Medical Center Drive
> Ann Arbor MI 48109
> 734-647-5623