[R] How to read only specified columns from a data file

Sarah Goslee sarah.goslee at gmail.com
Wed Mar 16 13:13:58 CET 2011


read.table() looks only at the first five lines of input when determining
how many columns there are. If a later line (row 7 in your example) has more
fields than that, and you do not give the number of columns explicitly in the
read.table() call (for example via col.names), the extra fields are wrapped
onto the next row.
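
A minimal sketch of that behaviour and one way around it, using a small
made-up ragged file rather than Luis's data (the temp file, its contents and
the variable names are all invented for illustration). It counts the fields
on every line with count.fields(), passes col.names of that full width, and
then uses the string "NULL" in colClasses to drop the columns that are not
wanted:

```r
# Hypothetical ragged file: row i has i + 2 fields, so the first five
# lines are narrower than the later ones.
tmp <- tempfile(fileext = ".txt")
rows <- vapply(1:8, function(i) {
  paste(c(i, "log_fy_coff", seq_len(i)), collapse = " ")
}, character(1))
writeLines(rows, tmp)

# Default behaviour: the column count comes from the first five lines only,
# so the extra fields on later lines spill onto additional rows.
wrapped <- read.table(tmp, header = FALSE, fill = TRUE)

# Count the fields on every line to find the true maximum width.
max_cols <- max(count.fields(tmp))
all_names <- paste0("V", seq_len(max_cols))

# Giving col.names of the full width fixes the column count;
# fill = TRUE pads the shorter rows with NA.
full <- read.table(tmp, header = FALSE, fill = TRUE, col.names = all_names)

# To keep only the first five columns: NA means "guess the type",
# the string "NULL" (not the object NULL) means "skip this column".
keep <- c(rep(NA, 5), rep("NULL", max_cols - 5))
first5 <- read.table(tmp, header = FALSE, fill = TRUE,
                     col.names = all_names, colClasses = keep)

dim(wrapped)
dim(full)
names(first5)
```

Note that rep(NULL, 430) is simply NULL, so it cannot serve as a template for
colClasses; the skip marker has to be the character string "NULL".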

This was discussed on the list within the last couple of weeks.

Sarah

On Wed, Mar 16, 2011 at 7:54 AM, Luis Ridao <luridao at gmail.com> wrote:
> David,
>
> Thanks for your tip but it seems I'm having problems with the number
> of columns R manages to read in. Below is an example of the data read in:
>
>> inp[1:20,]
>        V1          V2        V3       V4     V5     V6     V7     V8     V9
> 1   1.0000 log_fy_coff -1.007600 0.119520 1.0000     NA            NA     NA
> 2   2.0000 log_fy_coff -0.935010 0.112840 0.8896 1.0000            NA     NA
> 3   3.0000 log_fy_coff -0.876260 0.107500 0.8219 0.8847 1.0000     NA     NA
> 4   4.0000 log_fy_coff -0.683090 0.103030 0.7656 0.8143 0.8747 1.0000     NA
> 5   5.0000 log_fy_coff -0.623500 0.100980 0.7206 0.7636 0.8086 0.8764 1.0000
> 6   6.0000 log_fy_coff -0.583330 0.098978 0.6819 0.7214 0.7615 0.8150 0.8762
> 7   1.0000                    NA       NA     NA     NA            NA     NA
> 8   7.0000 log_fy_coff -0.676790 0.096608 0.6521 0.6892 0.7254 0.7719 0.8148
> 9   0.8717      1.0000        NA       NA     NA     NA            NA     NA
> 10  8.0000 log_fy_coff -0.696060 0.093761 0.6297 0.6654 0.6988 0.7405 0.7750
> 11  0.8116      0.8643  1.000000       NA     NA     NA            NA     NA
> 12  9.0000 log_fy_coff -0.527060 0.089949 0.6003 0.6347 0.6667 0.7060 0.7367
>
> As you can see, there are only 9 columns in inp, and the remaining values
> are read into the following row (see row 7).
> I just don't understand why this is happening (using fill=T does not
> help either).
>
> Best,
> Luis
>
> On Tue, Mar 15, 2011 at 5:15 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>>
>> On Mar 15, 2011, at 1:11 PM, <rex.dwyer at syngenta.com> wrote:
>>
>>> I think you need to read an introduction to R.
>>> For starters, read.table returns its results as a value, which you are not
>>> saving.
>>> The probable answer to your question:
>>> Read the whole file with read.table, and select columns you need, e.g.:
>>> tab <- read.table(myfile, skip=2)[,1:5]
>>>
>>> -----Original Message-----
>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>>> On Behalf Of Luis Ridao
>>> Sent: Tuesday, March 15, 2011 11:53 AM
>>> To: r-help at r-project.org
>>> Subject: [R] How to read only specified columns from a data file
>>>
>>> R-help,
>>>
>>> I'm trying to read a data file with plenty of columns.
>>> I just need the first 5, but it does not work when I do something like:
>>>
>>>> mycols <- rep(NULL, 430) ; mycols[c(1:4)] <- NA
>>>> read.table(myfile, skip=2, colClasses=mycols)
>>
>> I would have suggested:
>>
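>> # rep(NULL, 430) is just NULL; the string "NULL" marks a column to skip,
>> # and NA lets read.table() guess the type (column 2 here is text)
>> mycols <- rep("NULL", 430) ; mycols[1:5] <- NA
>> inp <- read.table(myfile, skip=2, colClasses=mycols)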
>> head(inp)
>>
>> --
>> David.
>>
>>>
>>> Any suggestions?
>>>

-- 
Sarah Goslee
http://www.functionaldiversity.org
