[R] read.table: deciding automatically between two colClasses values
Oliver Kullmann
O.Kullmann at swansea.ac.uk
Sun Aug 28 17:50:16 CEST 2011
Hi Josh,
thanks, that worked!
For the record, here is a function that determines the number of
whitespace-separated strings in the first line of a file:
# Removes leading and trailing whitespace from string x:
trim = function(x) gsub("^\\s+|\\s+$", "", x)
# The number of whitespace-separated strings in the first line of the file named f
# (splitting on "\\s+" so that a run of spaces or tabs counts as one separator):
lengthfirstline = function(f) {
  length(unlist(strsplit(trim(readLines(f, n = 1)), "\\s+")))
}
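To tie this back to the original question, here is a minimal sketch of how the field count could select between the two colClasses layouts. It assumes that the header line has exactly one name per column, so that the two layouts show up as 16 and 14 header fields (6+1+1+8 and 4+1+1+8). The names count_header_fields and read_data are hypothetical, and trimws needs R 3.2.0 or later.

```r
# Count the whitespace-separated fields in the first line of file f.
count_header_fields <- function(f) {
  first <- readLines(f, n = 1L)
  length(strsplit(trimws(first), "\\s+")[[1]])
}

# Hypothetical wrapper: choose the number of leading integer columns
# from the header width (assumption: 16 fields -> 6, 14 fields -> 4).
read_data <- function(filename, ...) {
  n <- count_header_fields(filename)
  int_n <- switch(as.character(n),
                  "16" = 6L,
                  "14" = 4L,
                  stop("unexpected number of header fields: ", n))
  read.table(file = filename, header = TRUE,
             colClasses = c(rep("integer", int_n), "numeric", "integer",
                            rep("numeric", 8L)),
             ...)
}
```

Only the first line is read before the call to read.table, so the cost of the check does not grow with the size of the file.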
Oliver
On Sun, Aug 28, 2011 at 07:23:07AM -0700, Joshua Wiley wrote:
> Hi Oliver,
>
> Look at ?readLines
>
> I imagine something like:
>
> tmp <- readLines(filename, n = 1L)
> # (do stuff with the first line to decide)
> IntN <- 6 # (or 4)
> NumN <- 8 # (or whatever)
> E <- read.table(file = filename, header = TRUE, colClasses =
> c(rep("integer", IntN), "numeric", "integer", rep("numeric", NumN)), ...)
>
> Cheers,
>
> Josh
>
> On Sun, Aug 28, 2011 at 7:13 AM, Oliver Kullmann
> <O.Kullmann at swansea.ac.uk> wrote:
> > Hello,
> >
> > I have a function for reading a data-frame from a file, which contains
> >
> > E = read.table(file = filename,
> >                header = TRUE,
> >                colClasses = c(rep("integer",6),"numeric","integer",rep("numeric",8)),
> >                ...)
> >
> > Now a small variation arose, where
> >
> > colClasses = c(rep("integer",4),"numeric","integer",rep("numeric",8))
> >
> > needed to be used (so just a small change).
> > I want this to be convenient for the user, so no user intervention should
> > be needed; instead the function should choose between the two values
> > "4" and "6" according to the header line.
> >
> > Now this seems to be a problem: I found only count.fields, which
> > however cannot read just the first line. Reading the
> > whole file (just to get the first line) is awkward, especially since these
> > files typically have millions of lines. The only way to influence
> > count.fields seems to be via skip, but I could only use that to skip to the
> > last line, which reads the whole file nevertheless, and I also don't know
> > the number of lines in the file.
> >
> > Perhaps one could catch an error, when the first invocation of
> > read.table fails, and try the second one. However tryCatch doesn't
> > seem to make it simple to write something like
> >
> > E = try(expr1 otherwise expr2)
> >
> > (if expr1 fails, evaluate expr2 instead) ?
> >
> > Oliver
> >
>
>
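On the side question in the original post about writing something like try(expr1 otherwise expr2): a small helper around tryCatch might look as follows. This is only a sketch; the name otherwise is made up here, and whether a first read.table call with the wrong colClasses actually signals an error depends on the data, so the header-based check is probably the more robust route.

```r
# Evaluate `first`; if that signals an error, evaluate `fallback` instead.
# Function arguments in R are evaluated lazily, so `fallback` is only
# evaluated when the error handler runs.
otherwise <- function(first, fallback) {
  tryCatch(first, error = function(e) fallback)
}

# Toy usage: the first expression fails, so the second one is used.
x <- otherwise(stop("first attempt failed"), 42)

# With read.table this could be written as (not run here):
# E <- otherwise(read.table(filename, header = TRUE, colClasses = classes6),
#                read.table(filename, header = TRUE, colClasses = classes4))
```

Here classes6 and classes4 stand for the two colClasses vectors from the original post.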