[R] how to break while loop reading from connection with read.csv
Duncan Murdoch
murdoch.duncan at gmail.com
Mon Jan 21 17:41:46 CET 2013
On 13-01-21 10:56 AM, Collins, Stephen wrote:
> Hello,
>
> I'm trying to read a file a few rows at a time, so as not to read the entire file into memory. From the "connections" and "readLines" help pages and the R-help archive, it seems this should be possible with read.csv and a file connection, by using the "nrows" argument and checking whether "nrow()" of the new batch is zero.
>
> From certain posts, it seemed that read.csv should return "character(0)" when the end of file is reached, and there are no more rows to read. Instead, I get an error there are "no lines available for input." Have I made a mistake with the file, or calling read.csv?
>
> What is the proper way to check the end-of-file condition with read.csv, such that I could break a while loop reading the data in?
>
> #example, make a test file
> con <- file("test.csv","wt")
> cat("a,b,c\n", "1,2,3\n", "4,5,6\n", "7,6,5\n", "4,3,2\n", "3,2,1\n",file=con)
> unlink(con)
I don't think this is causing your problem, but unlink() seems like the
wrong function to use here. Don't you mean close()?
>
> #show the file is valid
> con <- file("test.csv","rt")
> read.csv(con,header=T)
> unlink(con)
>
> #show that readLines ends with "character(0)", like expected
> con <- file("test.csv","rt")
> readLines(con,n=10)
> readLines(con,n=10)
> unlink(con)
>
> #show that read.csv ends with an error
> con <- file("test.csv","rt")
> read.csv(con,header=T,nrows=10)
> read.csv(con,header=F,nrows=10)
> unlink(con)
See the Value section of ?read.csv. In particular,
"Empty input is an error unless col.names is specified, when a 0-row
data frame is returned: similarly giving just a header line if header =
TRUE results in a 0-row data frame. Note that in either case the columns
will be logical unless colClasses was supplied."
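
Putting that together with the close() remark above, here is a minimal sketch of a batched read loop. It assumes the connection is opened explicitly with file(..., "rt") so the read position carries over between calls; the names path, chunk_size, cols and total_rows are made up for illustration, and the row count is only a stand-in for real per-batch work.

```r
# Write a small test file (same data as in the original example).
path <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1,2,3", "4,5,6", "7,6,5", "4,3,2", "3,2,1"), path)

chunk_size <- 2  # hypothetical batch size, chosen only for illustration

con <- file(path, "rt")  # open explicitly so the read position persists

# First batch: read the header line plus chunk_size data rows.
chunk <- read.csv(con, header = TRUE, nrows = chunk_size)
cols <- names(chunk)
total_rows <- 0

while (nrow(chunk) > 0) {
  total_rows <- total_rows + nrow(chunk)  # stand-in for real per-batch work
  # Supplying col.names makes empty input return a 0-row data frame
  # instead of raising the "no lines available in input" error.
  chunk <- read.csv(con, header = FALSE, nrows = chunk_size, col.names = cols)
}

close(con)  # close(), not unlink(), releases the connection
```

Because a 0-row result has logical columns unless colClasses is given, the loop only tests nrow(chunk) and never combines the empty batch with earlier ones.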
Duncan Murdoch