[R] End of line marker?
Duncan Murdoch
murdoch at stats.uwo.ca
Fri Mar 5 05:55:30 CET 2010
On 04/03/2010 11:40 PM, David Winsemius wrote:
> On Mar 4, 2010, at 10:58 PM, Duncan Murdoch wrote:
>
>> On 04/03/2010 10:32 PM, David Winsemius wrote:
>>> On Mar 4, 2010, at 9:47 PM, jonas garcia wrote:
>>>> When I opened the file with a hex-editor, the problematic
>>>> character turned out to be “1a”
>>>> I am attaching a sample DAT file with 3 lines (the second line is
>>>> the one with the undesirable character).
>>>>
>>>> The furthest I could get was through readBin:
>>>>
>>>>> tmp<- readBin("new.dat", what = "raw", n=100000000)
>>>> [1] 30 32 3a 33 35 3a 33 32 2c 20 34 34 30 33 2c 20 33 37 2e 31 31 34 2c 2d 32 30 2e 38 33 36 2c 31
>>>> [33] 35 35 2e 39 2c 30 30 2e 37 36 2c 31 31 35 36 0d 0a 30 32 3a 33 35 3a 33 35 2c 20 34 34 33 32 2c
>>>> [65] 20 33 37 2e 31 31 34 2c 2d 32 30 2e 38 33 36 2c 31 35 35 2e 38 2c 1a 30 2e 38 31 2c 31 31 35 37
>>>> [97] 0d 0a 30 32 3a 33 35 3a 33 39 2c 20 34 34 36 37 2c 20 33 37 2e 31 31 34 2c 2d 32 30 2e 38 33 36
>>>> [129] 2c 31 35 35 2e 38 2c 30 30 2e 38 31 2c 31 31 35 38
>>>>
>>>>
>>>>> tmp[87]
>>>> [1] 1a
>>> I got a different "interpretation" of that character when I let R
>>> look at it. And I cannot figure out why \032 should be causing
>>> problems??? :
>> Hex 1a and octal 032 both correspond to Ctrl-Z, which is the MSDOS
>> EOF marker. I forget whether R's text reading routines pay
>> attention to that, or whether it's the C runtime, but it makes sense
>> that it would cause problems on Windows.
>>
>> Duncan Murdoch
>
> Thanks. I was interpreting \032 as decimal, so couldn't figure out why
> it should equal 0x1A. You've explained the basis (or base) of my
> confusion.
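
For anyone who hits the same base confusion, the equivalence can be checked directly in R. This is only a small sketch using base R functions; the variable name ctrl_z is made up for illustration.

```r
# Ctrl-Z is code point 26 (decimal); the same byte written in other bases.
ctrl_z <- as.raw(26)

stopifnot(
  strtoi("1a",  base = 16L) == 26L,      # hex 1a is decimal 26
  strtoi("032", base = 8L)  == 26L,      # octal 032 is decimal 26
  identical(as.raw(0x1a), ctrl_z),       # 0x1a is R's hex literal syntax
  identical(charToRaw("\032"), ctrl_z)   # "\032" in an R string is an octal escape
)
```

The last check is the relevant one here: a backslash followed by digits inside an R string is read as octal, so "\032" is the byte 0x1a and not character number 32.
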
By the way, here's one way to remove the bad char. Read it using
readBin as above, then

  tmp <- tmp[tmp != 0x1a]

to remove the bad chars, or

  tmp[tmp == 0x1a] <- charToRaw(" ")

to replace them with spaces. Then write the tmp vector out to a file
with writeBin.
Duncan Murdoch
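
Put together, the steps described above might look roughly like the following. This is a sketch and not code from the thread. The sample bytes are modelled on the hex dump quoted earlier, with a trailing line ending added to the last record. The temporary file paths and the names infile, outfile, sample_bytes, bad and dat are assumptions made for illustration. It sizes the read from the file itself instead of passing a large fixed n.

```r
# Build a small sample file resembling the one in the thread
# (hypothetical temp-file paths; the second line holds a stray 0x1a byte).
infile  <- tempfile(fileext = ".dat")
outfile <- tempfile(fileext = ".dat")
sample_bytes <- c(
  charToRaw("02:35:32, 4403, 37.114,-20.836,155.9,00.76,1156\r\n"),
  charToRaw("02:35:35, 4432, 37.114,-20.836,155.8,"), as.raw(0x1a),
  charToRaw("0.81,1157\r\n"),
  charToRaw("02:35:39, 4467, 37.114,-20.836,155.8,00.81,1158\r\n")
)
writeBin(sample_bytes, infile)

# Read the whole file as raw bytes, taking the byte count from the file.
tmp <- readBin(infile, what = "raw", n = file.info(infile)$size)

# Locate every Ctrl-Z byte and replace it with a space (0x20).
bad <- tmp == as.raw(0x1a)
tmp[bad] <- charToRaw(" ")

# Write the cleaned bytes back out unchanged otherwise.
writeBin(tmp, outfile)

# The cleaned file can now be read as ordinary comma-separated text.
dat <- read.csv(outfile, header = FALSE, strip.white = TRUE)
```

Replacing the byte with a space keeps every other byte at its original offset, which can be convenient when comparing the cleaned file against the original in a hex editor. Dropping the byte with tmp[!bad] works just as well for this sample.
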