[R] Problem related to multibyte string in CSV file
Ivan Krylov
kry|ov@r00t @end|ng |rom gm@||@com
Thu Nov 14 18:49:36 CET 2019
On Thu, 14 Nov 2019 09:34:30 -0800
Dennis Fisher <fisher using plessthan.com> wrote:
> Warning message:
> In readLines(FILE, n = 1) : line 1 appears to contain an
> embedded nul
<...>
> print(STRING)
> [1] "\xff\xfet"
The leading bytes \xff\xfe are a little-endian byte order mark, so most
probably the FILE is UCS-2LE-encoded (or maybe UTF-16LE). Unlike UTF-8,
text encoded using UCS-2LE contains a NUL byte for every code point
U+00FF and below, which is what the readLines warning is about. You
should decode it before processing it in R; one of the examples in
?readLines shows how to do it:
# read a 'Windows Unicode' file
A <- readLines(con <- file("Unicode.txt", encoding = "UCS-2LE"))
close(con)
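
If you want to automate the check, here is a minimal sketch of one way
to do it. It builds a small stand-in file by hand (the temporary path
and its contents are made up for illustration, not taken from your
data), looks at the first two bytes for the FF FE byte order mark, and
only then picks the encoding. Falling back to UTF-8 when no mark is
found is my assumption; adjust it to whatever your other files use.

```r
# Hypothetical stand-in for FILE: a tiny CSV written as UCS-2LE with a BOM.
path <- tempfile(fileext = ".csv")
txt <- "time,conc\n1,2\n"

# For ASCII text, UCS-2LE is each byte followed by a NUL byte.
payload <- as.raw(as.vector(rbind(as.integer(charToRaw(txt)), 0L)))
writeBin(c(as.raw(c(0xff, 0xfe)), payload), path)

# Every character contributed one NUL byte, hence the readLines warning.
n_nul <- sum(payload == as.raw(0))

# Peek at the first two bytes and test for the little-endian BOM.
bom <- readBin(path, what = "raw", n = 2)
is_ucs2le <- identical(bom, as.raw(c(0xff, 0xfe)))

# Choose the encoding (the UTF-8 fallback is an assumption) and decode.
enc <- if (is_ucs2le) "UCS-2LE" else "UTF-8"
con <- file(path, encoding = enc)
lines <- readLines(con)
close(con)
```

After this, lines should hold the decoded text with no embedded NULs,
and can be passed on to read.csv(text = lines) or similar.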
> Now to my question: I am trying to automate this process and I would
> like to see the output from the print command but without the [1]
> that precedes the string.
Try encodeString combined with cat or message.
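
A small sketch of what I mean. show_string() is a hypothetical helper
name (it is not part of base R), and the STRING value here is just a
stand-in containing a tab so that the escaping is visible:

```r
# Hypothetical helper: print a string with escapes and quotes, like
# print() does, but without the leading [1] index.
show_string <- function(x) {
  cat(encodeString(x, quote = '"'), "\n", sep = "")
}

STRING <- "a\tb"      # stand-in value, not the contents of your file

print(STRING)         # [1] "a\tb"
show_string(STRING)   # "a\tb"

# message() writes the same text to stderr instead of stdout.
message(encodeString(STRING, quote = '"'))

# Keep the escaped form around if it is needed for further checks.
escaped <- encodeString(STRING, quote = '"')
```

Use cat if the output should end up on stdout (e.g. captured by
sink()), and message if it is meant as a diagnostic.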
--
Best regards,
Ivan