[R] Row limit for read.table
Frank McCown
fmccown at cs.odu.edu
Wed Jan 17 18:22:40 CET 2007
> In your case, read.table behaves as documented.
> The ' - character is one of the standard quoting characters. Some (but
> very few) of the entrys contain single ' chars, so sometimes more than
> ten thousand lines are just treated as a single entry. Try using
> quote="" to disable quoting, as documented on the help page:
>
> f<-read.table("http://www.cs.odu.edu/~fmccown/R/Tchange_rates_crawled.dat",
> header=TRUE, nrows=123000, comment.char="", sep="\t",quote="")
>
> length(f$change_rate)
> [1] 122271
So either adding quote="" works, or removing sep="\t" (and leaving quote
at its default) works. It seems an odd side effect that specifying the
separator changes how quoting is applied to the ' character. I don't see
that association made in the help file.
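
For anyone finding this thread later, here is a minimal sketch of the
quote="" fix on a small made-up example. The data and the "name" column
are hypothetical stand-ins for the real file, and the text= argument
assumes a reasonably recent version of R:

```r
# Hypothetical tab-separated data; two of the entries contain a single ' character.
txt <- "name\tchange_rate\nO'Brien\t0.1\nSmith\t0.2\nD'Arcy\t0.3\n"

# quote = "" turns quoting off, so ' is read as an ordinary character
# and every input line becomes its own row.
f <- read.table(text = txt, header = TRUE, sep = "\t",
                quote = "", comment.char = "")

nrow(f)               # 3, one row per data line
as.character(f$name)  # the apostrophes are kept as part of the values
```

With quoting disabled, the row count matches the number of data lines
even though the apostrophes are unbalanced within each line.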
> There is (colClasses, works as documented). Try
>
> f<-read.table("http://www.cs.odu.edu/~fmccown/R/Tchange_rates_crawled.dat",
> + header=TRUE, nrows=123000, comment.char="",
> sep="\t",quote="",colClasses=c("character","NULL","NULL","NULL","NULL"))
> > dim(f)
> [1] 122271 1
> Did you read the help page?
Of course I did. For me the definition of colClasses wasn't clear...
"A vector of classes to be assumed for the columns" didn't seem to be
the same thing as "the columns you would like to be read." I may have
made the association if the help page had contained a simple example of
using colClasses.
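
Something like the following is the kind of example I had in mind. It is
only a sketch on made-up data: apart from change_rate, the column names
are hypothetical, and the text= argument assumes a reasonably recent
version of R. Giving "NULL" as the class of a column makes read.table
skip that column:

```r
# Hypothetical five-column, tab-separated data.
txt <- paste0("url\tc2\tc3\tc4\tchange_rate\n",
              "http://a.example\t1\t2\t3\t0.5\n",
              "http://b.example\t4\t5\t6\t0.7\n")

# Keep only the first column; "NULL" means "do not read this column".
f <- read.table(text = txt, header = TRUE, sep = "\t",
                quote = "", comment.char = "",
                colClasses = c("character", "NULL", "NULL", "NULL", "NULL"))
dim(f)    # 2 rows, 1 column
names(f)  # "url"

# Keep only the last column instead, read as numeric.
g <- read.table(text = txt, header = TRUE, sep = "\t",
                quote = "", comment.char = "",
                colClasses = c("NULL", "NULL", "NULL", "NULL", "numeric"))
g$change_rate
```

So colClasses does double duty: it sets the class of each column that is
kept, and it selects which columns are read at all.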
Thanks for the help,
Frank
--
Frank McCown
Old Dominion University
http://www.cs.odu.edu/~fmccown/