[BioC] duplicate correlation on Agilent 4x44 arrays
Sean Davis
sdavis2 at mail.nih.gov
Wed Apr 11 18:56:04 CEST 2007
On Tuesday 10 April 2007 08:07, Mitch Levesque wrote:
> Gordon,
>
> Thanks for the reply. I am not using any particular instruction set, just
> what I have put together from the User Guide.
>
> You were right about the file dimensions, they are different:
> > dim(RG)
> [1] 44407 4
> > gal <- readGAL()
> > dim(gal)
> [1] 180880 10
>
> Is it possible to read the duplicate positions directly off of the gal
> file? I tried:

If you are thinking that the four arrays on the slide represent "duplicates",
then that probably isn't correct. The "duplicates" in the sense of
duplicateCorrelation() are duplicate spots within an array that have the same
sample hybridized to them; hybridizing the same sample multiple times on the
same slide is not the typical use case (but perhaps you did do this?).

There are not many duplicate spots on Agilent arrays unless you have an array
design where this is the case. I don't recall what you said about your array
design, but unless many thousands of the 44k probes within one array are
duplicated, using duplicateCorrelation() is probably not warranted.

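One rough way to check whether an array carries enough within-array replicates is to count how often each probe identifier occurs. The sketch below uses a made-up vector of probe names; on real data you would presumably substitute the probe column of the annotation read from the FE file (RG$genes$ProbeName is an assumption about that column's name, so check names(RG$genes) first).

```r
# Hypothetical probe identifiers standing in for something like RG$genes$ProbeName
probe_names <- c("A_1", "A_2", "A_2", "A_3", "A_3", "A_3", "A_4")

# Number of spots for each distinct probe
spots_per_probe <- table(probe_names)

# Keep only the probes that are spotted more than once
replicated <- spots_per_probe[spots_per_probe > 1]

# How many distinct probes are replicated, and what share of all spots they cover
n_replicated_probes <- length(replicated)
frac_replicated_spots <- sum(replicated) / length(probe_names)
```

If only a small number of probes turn out to be replicated, that supports leaving duplicateCorrelation() out. Even when many are, the ndups and spacing arguments expect the replicates at a regular spacing in the data, so irregularly placed replicates would likely need reordering first.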
> layout <- getLayout(gal, guessdups=TRUE)
The confusion here, I think, comes from the fact that the GAL file describes
the entire slide (which includes 4 arrays), while RG has one column per array.
Don't use the GAL file for these arrays; take the annotation from the Agilent
FE file instead, which read.maimages() loads automatically with
source='agilent'. If there are other columns that you need, you can request
them directly in the read.maimages() call; see the documentation.
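As a minimal sketch of reading the FE files without a GAL file: the wrapper name read_agilent_fe, its extra_columns argument, and the "data" directory are made up for illustration, and the two saturation flag columns are only an assumption about what your FE output contains.

```r
# Sketch only: a thin wrapper around limma::read.maimages() for Agilent FE files.
# extra_columns lists additional FE columns to keep alongside the intensities;
# the defaults are assumed to be present in the FE output and may need changing.
read_agilent_fe <- function(files, path = NULL,
                            extra_columns = c("gIsSaturated", "rIsSaturated")) {
  limma::read.maimages(files,
                       source = "agilent",
                       path = path,
                       other.columns = extra_columns)
}

# Example call, assuming FE text files in a hypothetical "data" directory:
# RG <- read_agilent_fe(dir("data", pattern = "\\.txt$"), path = "data")
# dim(RG)          # spots per array x number of arrays
# head(RG$genes)   # annotation taken from the FE file, no GAL file involved
# names(RG$other)  # the extra columns requested above
```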
Note also that Agilent uses so-called orange-packed array designs, so the old
idea of row/column doesn't translate perfectly, as each row is offset from
the next. And within a given array (on the 4x44 there are four such arrays),
there are no subarrays.
> I haven't tried without the normexp, but I will test it. Thanks again.
Agilent uses a rather sophisticated background estimation method, so I agree
with Gordon that there really isn't a need to do more for these arrays. I would
encourage you to read the technical manual for the platform, which gives a
full description of the algorithm.
Sean