[R] Way to handle variable length and numbers of columns using read.table(...)

Gabor Grothendieck ggrothendieck at gmail.com
Tue May 5 05:44:53 CEST 2009


The last line should be as follows (the previous post missed the
Time column).
The regular expression matches either a string of digits at the start
of the line, ^[0-9]+, or zero or more digits, [0-9]*, followed by a
dot, [.], and exactly two more digits, [0-9][0-9].  Each time strapply
finds such a match, as.numeric is applied to it.  Thus each line of
input results in a numeric vector, and we then simplify those vectors
by rbind'ing them together.

> strapply(L[-1], "^[0-9]+|[0-9]*[.][0-9][0-9]", as.numeric, simplify = rbind)
     [,1]   [,2]   [,3]
[1,]    1  22.33  44.55
[2,]    2  66.77  88.99
[3,]    3 222.33 344.55
[4,]    4  66.77  88.99
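
For anyone who would rather not load gsubfn, here is a minimal sketch of
the same idea in base R, using gregexpr and regmatches in place of
strapply.  It relies on the same assumption as above (every non-Time
value ends in a dot plus exactly two digits), and the names pat,
matches and m are only illustrative.

```r
# Sample data from the thread; row 3 has its two values run together.
Lines <- "Time Loc1 Loc2
1 22.33 44.55
2 66.77 88.99
3 222.33344.55
4 66.77 88.99"

# Split into lines (same result as readLines(textConnection(Lines))).
L <- strsplit(Lines, "\n")[[1]]

# Same pattern as in the strapply call: a leading integer, or
# optional digits followed by a dot and exactly two digits.
pat <- "^[0-9]+|[0-9]*[.][0-9][0-9]"

# gregexpr locates every match on each data line;
# regmatches extracts the matched substrings as a list of
# character vectors, one per line.
matches <- regmatches(L[-1], gregexpr(pat, L[-1]))

# Convert each line's matches to numeric and stack them as rows.
# Assumption: every line yields the same number of matches.
m <- do.call(rbind, lapply(matches, as.numeric))

# Reuse the header line for the column names.
colnames(m) <- strsplit(L[1], " ")[[1]]
m
```

If a line could yield a different number of matches, the rbind step
would recycle values or warn, so it is worth checking
lengths(matches) before stacking.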


On Mon, May 4, 2009 at 11:04 PM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
> It's not clear exactly what the rules are for this, but if we assume
> that numbers always end in a decimal point plus two digits, then
> using strapply from the gsubfn package:
>
>> Lines <- "Time Loc1 Loc2
> + 1 22.33 44.55
> + 2 66.77 88.99
> + 3 222.33344.55
> + 4 66.77 88.99"
>>
>> library(gsubfn)
>> L <- readLines(textConnection(Lines))
>> strapply(L[-1], "[0-9]*[.][0-9][0-9]", as.numeric, simplify = rbind)
>       [,1]   [,2]
> [1,]  22.33  44.55
> [2,]  66.77  88.99
> [3,] 222.33 344.55
> [4,]  66.77  88.99
>
> See http://gsubfn.googlecode.com and for regular expressions see ?regex
>
> On Mon, May 4, 2009 at 10:20 PM, Jason Rupert <jasonkrupert at yahoo.com> wrote:
>>
>> I've got read.table to successfully read in my table of three columns.  Most of the time I will have a set number of rows, but sometimes that will be variable, and sometimes there will be only two variables in one row, e.g.
>>
>> Time Loc1 Loc2
>> 1 22.33 44.55
>> 2 66.77 88.99
>> 3 222.33344.55
>> 4 66.77 88.99
>>
>> Is there any way to have read.table handle (1) a variable number of rows, and (2) rows where there appear to be only two variables, as shown at Time = 3 above?
>>
>> Just curious about how to handle this, and whether read.table is the right way to go about it, or if I should read in all the data and then try to parse it out as best I can.
>>
>> Thanks again.
>>
>>> R.version
>>               _
>> platform       i386-apple-darwin8.11.1
>> arch           i386
>> os             darwin8.11.1
>> system         i386, darwin8.11.1
>> status
>> major          2
>> minor          8.0
>> year           2008
>> month          10
>> day            20
>> svn rev        46754
>> language       R
>> version.string R version 2.8.0 (2008-10-20)