[R] read.table - unexpected truncation of rows

Dawn Field dfield at molbiol.ox.ac.uk
Tue Aug 21 18:55:02 CEST 2001


Hello,

Are there any 'unallowed' (meta)characters that can cause R to parse 
a file incorrectly when using read.table?
--Dawn




Here's the background, if it's needed:

I'm new to R, but have done some Perl.  I'm trying to read in a 
data.frame from a file using * as the field separator.

data <- read.table("all_genomes.data",header=T,sep="*")

This has been working great until now when I tried to parse my latest 
(biggest) file.  The file has been generated with the exact same 
script as other files that have imported correctly, but with 
different original data.
R reads the new file and complains that the rows aren't all the same length.

Here's some truncated output (the number of fields read on each line):
[409] 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11
[433] 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11
[457]  4 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11

...

[1201] 11 11 11 11 11 11
   Error in read.table("all_genomes.data", header = T, as.is = TRUE, sep = "*") :
           all rows must have the same length.
   Execution halted

In trying to figure out why it's reading four fields on line 457 
instead of the proper 11, I've
* looked manually at the lines that come up short (they have 11 fields)
* imported the file into Excel to make sure the separator is working 
as I expected. The file imports perfectly into Excel using this 
separator, with 11 fields for all entries.
* looked for funny metacharacters that could be confusing the parser
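
The checks above could also be done from within R. What follows is only a 
sketch, and it assumes (a guess, not something confirmed for this file) that 
quote characters (' or ") or the comment character # inside the text fields 
are what confuses the parser. It compares count.fields() under its default 
settings with a run where quote and comment handling are switched off, and 
prints the lines where the two disagree. The temporary file and its contents 
are made up for illustration; they are not taken from "all_genomes.data".

```r
# Hypothetical stand-in for "all_genomes.data": four '*'-separated fields,
# with apostrophes in one record and a '#' in another.
tmp <- tempfile(fileext = ".data")
writeLines(c(
  "id*start*stop*value",
  "1*alpha*beta*10",
  "2*5' end*3' end*20",
  "3*gene#7*delta*30"
), tmp)

# Fields per line as seen with the default quote and comment settings.
n_default <- count.fields(tmp, sep = "*")

# Fields per line when quotes and comments are treated as ordinary text.
n_raw <- count.fields(tmp, sep = "*", quote = "", comment.char = "")

# Lines where the two counts disagree (NA marks a line inside an open quote).
suspect <- which(is.na(n_default) | n_default != n_raw)

# Show the offending lines together with both counts.
print(data.frame(line    = suspect,
                 default = n_default[suspect],
                 raw     = n_raw[suspect],
                 text    = readLines(tmp)[suspect]))
```

If the raw counts are all equal while the default counts are not, that points 
at quoting or comment handling rather than at the separator itself.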


I still suspect that some characters carried over from my original data 
into "all_genomes.data" are ones that read.table doesn't like (my data 
is not all numeric), but I can't find them.
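
If the troublesome characters do turn out to be quotes or # signs (again an 
assumption, not something established for this file), one option is to tell 
read.table to treat them as plain text through its quote and comment.char 
arguments. A minimal sketch on made-up data, keeping the same separator and 
as.is = TRUE as in the call that failed:

```r
# Hypothetical stand-in data: the second record contains both an apostrophe
# and a '#', which are special to read.table under its default settings.
tmp <- tempfile(fileext = ".data")
writeLines(c(
  "id*label*value",
  "1*alpha*10",
  "2*5' end # partial*20",
  "3*gamma*30"
), tmp)

# quote = ""        : no character is treated as a quote
# comment.char = "" : '#' no longer starts a comment
dat <- read.table(tmp, header = TRUE, sep = "*", as.is = TRUE,
                  quote = "", comment.char = "")

str(dat)
```

With both features disabled, only the separator and the line breaks are left 
to decide where fields and records end.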

I've been looking at help(read.table) and tried to go through the 
archive and FAQ for some advice, but haven't found any yet.

Any suggestions would be greatly appreciated. I think R is great and 
I'm looking for all the tips I can get.
Dawn



-- 


Dawn Field
Molecular Evolution and Bioinformatics
CEH, Oxford
Mansfield Road
Oxford
OX1 3SR

Tel: *-44-(0)1865-281659 (direct)
Tel: *-44-(0)1865-281630 (reception to leave a message)
Fax: *-44-(0)1865-281696
http://www.nox.ac.uk/


