[R] How to read malformed csv files with read.table?
jim holtman
jholtman at gmail.com
Fri Aug 22 17:02:22 CEST 2008
Try this. It counts the fields in the first two lines and, if the header
line is short, pads it with dummy column names before calling read.table:
x <- " time/ms C550.KMS Cyt_b559.KMS Cyt_b563.KMS Cyt_f_.KMS P515FR.KMS Scatt.KMS Zea2.KMS PC P700
0 Point1 -599.500 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
0 Point2 -598.000 -0.012 -0.013 0.040 0.013 0.027 0.010 0.022 0.000 0.000
0 Point3 -596.500 -0.015 -0.015 0.044 0.020 0.025 0.010 0.033 0.000 0.000"
# find out how many dummy headers you have to add
x.c <- count.fields(textConnection(x))
x.diff <- x.c[2] - x.c[1] # assume first line is short
x.connection <- textConnection(x) # setup connection
if (x.diff > 0){
    # read first line
    x.first <- readLines(x.connection, n=1)
    # add dummy headers
    x.first <- paste(x.first, paste(LETTERS[1:x.diff], collapse=" "))
    # push back the line so it is ready for read.table
    pushBack(x.first, x.connection)
}
input <- read.table(x.connection, header=TRUE)
closeAllConnections()
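
Since you have many files, the same padding idea could be wrapped in a
function that works directly on a file path. This is only a sketch:
read_short_header, its prepend argument and the X1, X2, ... dummy names are
made up for illustration, and it assumes a tab-separated file with at least
one data row. It reads the header with scan() and passes the padded names
to read.table via col.names instead of using pushBack.

```r
# Hypothetical helper (not part of base R): read a delimited file whose
# header line has fewer fields than its data lines.
read_short_header <- function(path, sep = "\t", prepend = FALSE) {
    # number of fields on every line; the first entry is the header
    n <- count.fields(path, sep = sep)
    # how many column names are missing (assumes at least one data row)
    n.missing <- max(n[-1], na.rm = TRUE) - n[1]
    # read only the header line as a character vector
    header <- scan(path, what = "", sep = sep, nlines = 1, quiet = TRUE)
    if (n.missing > 0) {
        # dummy names X1, X2, ...; guarded because paste0("X", integer(0))
        # would return "X" rather than an empty vector
        dummy <- paste0("X", seq_len(n.missing))
        header <- if (prepend) c(dummy, header) else c(header, dummy)
    }
    # skip the original header and supply the padded names instead
    read.table(path, sep = sep, skip = 1, header = FALSE, col.names = header)
}
```

Something like lapply(list.files(pattern = "\\.CSV$"), read_short_header)
would then read every matching file in the working directory. By default
the dummy names go at the end, as in the code above. If the unnamed fields
are actually the leading ones (the sample rows begin with 0 and Point1, so
that may be the case here), prepend = TRUE puts the dummy names first.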
On Fri, Aug 22, 2008 at 10:19 AM, Martin Ballaschk
<tmp082008 at ballaschk.com> wrote:
> Hi,
>
> how do I read files that have two fewer header fields than they have columns?
> The easiest solution would be to insert one or two additional header fields,
> but I have a lot of files and doing that by hand would be a lot of tedious work.
>
> Any ideas on how to solve that problem?
>
> #######
> R stuff:
>
>> read.table("myfile.CSV", sep = "\t", header = T)
> Error in read.table("myfile.CSV", sep = "\t", :
> more columns than column names
>
>> count.fields("myfile.CSV", sep = "\t")
> [1] 10 12 12 12 12 12 12 12 12 12 12 [...]
>
> #######
> ugly sample ("Exported by SDL DataTable component"):
>
> time/ms C550.KMS Cyt_b559.KMS Cyt_b563.KMS Cyt_f_.KMS P515FR.KMS Scatt.KMS Zea2.KMS PC P700
> 0 Point1 -599.500 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
> 0 Point2 -598.000 -0.012 -0.013 0.040 0.013 0.027 0.010 0.022 0.000 0.000
> 0 Point3 -596.500 -0.015 -0.015 0.044 0.020 0.025 0.010 0.033 0.000 0.000
> [...]
>
>
> Cheers,
> Martin
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?