[BioC] problem reading genepix files using both marray and limma functions

Bela Tiwari btiwari at ceh.ac.uk
Mon Aug 16 17:37:53 CEST 2004


Hello,


Last week I was sent GenePix data files from an associate. As far as
I'm aware, these files have not been edited in any way before being sent
to me.  

My aim was to load them up and run some marray and/or limma functions
on the data.

First I tried to load the files (16 of them) using read.GenePix(), but
this failed with an error:

   Error in "colnames<-"( `*tmp*`, value = fnames) :
               length of dimnames [2] not equal to array extent


Then I tried loading a single file using read.GenePix() and that
worked fine; however, loading subsets of the files did not.

I then read through some of the relevant Bioconductor mailing list
posts that I could find, and decided to try the read.maimages function
as an alternative.

This I did, only to get errors such as:

      line 35162 did not have 43 elements


So, I tried loading the files individually, using read.maimages() to
see if I could track down the "problem" files, and then look at them to
see if there was an issue with certain lines within those files.

I did this, and found that 5 of my 16 files would not load using
read.maimages and gave errors like the one directly above.

One file gave a different error:

   "number of items read is not a multiple of the number of columns"

giving me a total of 6 out of 16 files that won't load using
read.maimages.

Tackling the latter error first - I looked at the file, and saw an
incomplete line at the bottom of the file. I got rid of that, and tried
to load the file using read.GenePix(). I still received a warning
message about the fact that the number of items read is not a multiple
of the number of columns.  I cannot spot the problem in the edited
version of the file. The edited file does, however, now read in without
error using read.maimages().
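
The manual fix described above (deleting an incomplete last line) can also be scripted. The following is only a sketch in Python, not part of marray or limma. It assumes the file is tab-delimited and that the most common field count in the file is the correct one for data rows. The function name strip_truncated_tail is made up for illustration.

```python
from collections import Counter


def strip_truncated_tail(lines, sep="\t"):
    """Return a copy of `lines` without trailing blank or short lines.

    Assumption: the most common field count among non-blank lines is the
    correct one for data rows.
    """
    counts = [len(l.rstrip("\r\n").split(sep)) for l in lines if l.strip()]
    if not counts:
        return list(lines)
    expected = Counter(counts).most_common(1)[0][0]

    kept = list(lines)
    # Drop lines from the end while they are blank or have the wrong count.
    while kept and (
        not kept[-1].strip()
        or len(kept[-1].rstrip("\r\n").split(sep)) != expected
    ):
        kept.pop()
    return kept
```

To use it, read the file with readlines(), pass the list to the function, and write the result to a new file so that the original stays untouched.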


I then tried loading each of the five files that "failed" with the
"did not have 43 elements" error individually with read.GenePix(), and
this works.

I did look at some of the files to try to see what the problem was
(i.e. whether there was anything obviously strange at the lines indicated
as problems by the read.maimages error message), but I can't see
anything.
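
Rather than inspecting the lines by eye, one can count the tab-separated fields on every line. The following is a minimal sketch in Python, so the file is checked independently of the Bioconductor readers. It assumes the files are tab-delimited. It also flags lines containing ' or #, because R's read.table family treats these as quote and comment characters by default; whether that is the cause here is a guess, not something established above. Both function names are made up for illustration.

```python
from collections import Counter


def field_count_outliers(lines, sep="\t"):
    """Return (expected, outliers) for a list of text lines.

    `expected` is the most common field count among non-blank lines;
    `outliers` is a list of (line_number, n_fields) for lines that differ.
    Line numbers are 1-based positions in the file.
    """
    counts = [
        (i, len(line.rstrip("\r\n").split(sep)))
        for i, line in enumerate(lines, start=1)
        if line.strip()
    ]
    if not counts:
        return 0, []
    expected = Counter(n for _, n in counts).most_common(1)[0][0]
    return expected, [(i, n) for i, n in counts if n != expected]


def suspicious_characters(lines, chars="'#"):
    """Return 1-based numbers of lines containing any character in `chars`.

    Assumption: these characters may be read as quote or comment markers
    by a table reader that uses R's read.table defaults.
    """
    return [
        i for i, line in enumerate(lines, start=1)
        if any(c in line for c in chars)
    ]
```

The header records at the top of a GenePix file will normally be reported as outliers and can be ignored. The line number in R's error message may be offset from the position in the file if header lines were skipped before reading.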


I then took the "successful" subset of my files (those I could read in
as individual files using read.maimages), and tried to read those in as
a group. This didn't work either, but the error I got was:

Error in "[.data.frame"(obj, , columns$Rf) :
                  undefined columns selected

So, I specified the columns explicitly in the read.maimages command,
but I still got the same error.
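
One possible cause of "undefined columns selected" when files are read as a group is that the column header line differs between files. This is an assumption; nothing above confirms it. The sketch below, in Python, compares each file's column header line against a list of required names. The caller must supply the header line from each file. The required names in the usage note are the GenePix foreground and background columns, and the file names and the function missing_columns are made up for illustration.

```python
def parse_header(line, sep="\t"):
    """Split a column header line into names, dropping surrounding quotes."""
    return [name.strip().strip('"') for name in line.rstrip("\r\n").split(sep)]


def missing_columns(headers_by_file, required):
    """Map each file name to the required column names its header lacks.

    `headers_by_file` maps a file name to that file's column header line.
    Files whose header contains every required name are left out.
    """
    report = {}
    for fname, header_line in headers_by_file.items():
        present = set(parse_header(header_line))
        absent = [c for c in required if c not in present]
        if absent:
            report[fname] = absent
    return report
```

For example, calling missing_columns with required = ["F635 Mean", "B635 Median", "F532 Mean", "B532 Median"] returns an empty dictionary when every file carries all four columns. Any file listed in the result is a candidate for the failure.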


Thankfully, a recent posting to the mailing list
(http://files.protsuggest.org/biocond/html/3512.html) mentioned issues
related to this, and Dave Nelson gave a solution that could be
implemented. I applied it, and my "successful" files then read in just
fine as a group using this hacked version of read.maimages().

I also tried using the read.GenePix() function to read in just the
group of "successful" files and that gives the error:

       Error in "colnames<-"( `*tmp*`, value = fnames) :
               length of dimnames [2] not equal to array extent


So, overall, my questions are:

Is there anyone out there who would be willing to scan over one of my
"successful" files and one of my "failed" files and see if they can spot
the problem? The errors suggest that the problem should be easy to
spot...but I can't see it. Even with all the gymnastics related above, I
still have a situation where I have only managed to load about half of
the files I have.

Has anyone else found groups of GenePix files to be this inconsistent
when reading them with Bioconductor functions?  And if so, do you have
any advice on how to make life as easy as possible?

Does anyone have any other comments about the internal
workings/assumptions of functions such as read.maimages in comparison
to, say, functions like read.GenePix, and which may be more forgiving,
or have known issues, etc?


Sorry this is such a long mail!


best wishes,

Bela Tiwari

*************************
Dr. Bela Tiwari
Lead Bioinformatician

CEH Oxford
Mansfield Road
Oxford, OX1 3SR
01865 281975


