[R] Can scan() detect end-of-file?

Sarah Goslee sarah.goslee at gmail.com
Thu Oct 15 22:44:45 CEST 2015


I've always used system("wc -l myfile") to get the number of lines in
advance. But here are two other R-only options, both using readLines
instead of scan. There's probably something more efficient, too.

Your setup:
t <- 'A "Two line\nentry"\n\n"Three\nline\nentry" D E\n'
tfile <- tempfile()
cat(t, file=tfile)
tcon <- file(tfile, "r") # or tcon <- textConnection(t)

readLines() returns character(0) once there are no more lines to read,
and "" for an empty line, so the two cases can be told apart.

> readLines(tcon, n=1)
[1] "A \"Two line"
> readLines(tcon, n=1)
[1] "entry\""
> readLines(tcon, n=1)
[1] ""
> readLines(tcon, n=1)
[1] "\"Three"
> readLines(tcon, n=1)
[1] "line"
> readLines(tcon, n=1)
[1] "entry\" D E"
> readLines(tcon, n=1)
character(0)
> readLines(tcon, n=1)
character(0)
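That behaviour can be turned into a loop that stops at end-of-file. This
is only a minimal sketch: it rebuilds the same temporary file as the
setup above, and the counter name nread is just something I made up for
illustration. Note that it walks physical lines, not the quote-aware
"virtual" lines that scan(nlines=1) returns.

```r
# Same demo data as in the setup above
t <- 'A "Two line\nentry"\n\n"Three\nline\nentry" D E\n'
tfile <- tempfile()
cat(t, file = tfile)
tcon <- file(tfile, "r")

nread <- 0  # hypothetical counter, for illustration only
repeat {
  line <- readLines(tcon, n = 1)
  # character(0) means the connection is exhausted;
  # an empty line comes back as "" and has length 1
  if (length(line) == 0) break
  nread <- nread + 1
  cat(sprintf("%d: [%s]\n", nread, line))
}
close(tcon)
```

Testing length(line) == 0 rather than line == "" is what keeps the
empty third line from ending the loop early.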

Or, if the file isn't too large for memory, you can read the whole
thing in and then process it line by line:

tcon <- file(tfile, "r") # or tcon <- textConnection(t)
allfile <- readLines(tcon) # the default n = -1 reads to the end of the connection

> length(allfile)
[1] 6
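Once everything is in a character vector, the line count is known up
front and there is no end-of-file to detect. A rough sketch of the
line-by-line processing, again rebuilding the demo file so it runs on
its own (n_empty is a made-up name, and passing the file name straight
to readLines() is my shortcut, not part of the setup above):

```r
# Same demo data as in the setup above
t <- 'A "Two line\nentry"\n\n"Three\nline\nentry" D E\n'
tfile <- tempfile()
cat(t, file = tfile)

# Given a file name, readLines() opens and closes the file itself
allfile <- readLines(tfile)

for (i in seq_along(allfile)) {
  if (allfile[i] == "") {
    cat(sprintf("%d: (empty line)\n", i))
  } else {
    cat(sprintf("%d: %s\n", i, allfile[i]))
  }
}

# hypothetical summary: how many of the physical lines were empty
n_empty <- sum(allfile == "")
```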

On Thu, Oct 15, 2015 at 4:16 PM, William Dunlap <wdunlap at tibco.com> wrote:
> I would like to read a connection line by line with scan but
> don't know how to tell when to quit trying.  Is there any
> way that you can ask the connection object if it is at the end?
>
> E.g.,
>
> t <- 'A "Two line\nentry"\n\n"Three\nline\nentry" D E\n'
> tfile <- tempfile()
> cat(t, file=tfile)
> tcon <- file(tfile, "r") # or tcon <- textConnection(t)
> scan(tcon, what="", nlines=1)
> #Read 2 items
> #[1] "A"               "Two line\nentry"
> scan(tcon, what="", nlines=1)  # empty line
> #Read 0 items
> #character(0)
> scan(tcon, what="", nlines=1)
> #Read 3 items
> #[1] "Three\nline\nentry" "D"                  "E"
> scan(tcon, what="", nlines=1) # end of file
> #Read 0 items
> #character(0)
> scan(tcon, what="", nlines=1) # end of file
> #Read 0 items
> #character(0)
>
> I am reading virtual line by virtual line because the lines
> may have different numbers of fields.
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
-- 
Sarah Goslee
http://www.functionaldiversity.org


