[R] input string ... cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?
Duncan Murdoch
murdoch@dunc@n @end|ng |rom gm@||@com
Fri Apr 23 05:07:30 CEST 2021
On 22/04/2021 9:25 p.m., Spencer Graves wrote:
> Hello:
>
>
> What if anything should I do regarding notes from either "load" or
> "attach" that, "input string ... cannot be translated to UTF-8, is it
> valid in 'ANSI_X3.4-1968'?"?
First, ANSI_X3.4-1968 is an official name for a version of ASCII.
It appears near the start of the file, where I believe it records the
native encoding in effect when the file was written, so that readers
using a different encoding can translate.
Your actual file appears to have been encoded in UTF-8, but not marked
as such. You're lucky you read it on macOS, where UTF-8 is the native
encoding, since the reader probably recognized that the bytes weren't
ASCII (and warned you about that), then just left them alone. If you
read that file on Windows you'd likely get junk for those entries.
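If the strings really are UTF-8 bytes that were never marked, one way to
make the result portable is to declare the encoding after loading. The
following is only a sketch under that assumption: mark_utf8() is a
hypothetical helper name, not part of R, and it only touches character
vectors, factor levels, and the elements of lists and data frames (not
names or other attributes). Encoding<- changes the declared encoding
without altering any bytes.

```r
# Hypothetical helper: declare character data to be UTF-8.
# Assumes the underlying bytes already are valid UTF-8.
mark_utf8 <- function(x) {
  if (is.character(x)) {
    Encoding(x) <- "UTF-8"          # marks non-ASCII strings; bytes unchanged
  } else if (is.factor(x)) {
    lev <- levels(x)
    Encoding(lev) <- "UTF-8"
    levels(x) <- lev
  } else if (is.list(x)) {
    x[] <- lapply(x, mark_utf8)     # recurse; x[] keeps data frame attributes
  }
  x
}

# Possible usage (not run here): load into a separate environment,
# then re-mark every object that was loaded.
#   e <- new.env()
#   load("NAVCO 1.3 List.RData", envir = e)
#   for (nm in ls(e)) assign(nm, mark_utf8(get(nm, envir = e)), envir = e)
```

Plain ASCII strings are unaffected by this, so it should be harmless to
apply it to every object in the file.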
For your interest, here's a dump of the start of your file, after
gunzipping it:
00000000 52 44 58 33 0a 58 0a 00 00 00 03 00 03 06 00 00 |RDX3.X..........|
00000010 03 05 00 00 00 00 0e 41 4e 53 49 5f 58 33 2e 34 |.......ANSI_X3.4|
00000020 2d 31 39 36 38 00 00 04 02 00 00 00 01 00 04 00 |-1968...........|
00000030 09 00 00 00 01 78 00 00 03 13 00 00 00 10 00 00 |.....x..........|
00000040 02 0e 00 00 02 6e 40 90 0c 00 00 00 00 00 40 90 |.....n@.......@.|
00000050 44 00 00 00 00 00 40 10 00 00 00 00 00 00 40 7c |D.....@.......@||
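The same information can be pulled out from within R. This is a minimal
sketch that assumes the layout visible in the dump above: the 7 bytes
"RDX3\nX\n", three 4-byte big-endian integers, then a 4-byte big-endian
length followed by the encoding name. read_rdata_encoding() is a
hypothetical helper name, not an R function, and it makes no attempt to
handle other workspace formats.

```r
# Hypothetical helper: return the native encoding recorded in the header
# of a version-3 XDR workspace file (the layout shown in the dump).
read_rdata_encoding <- function(path) {
  con <- gzfile(path, "rb")          # also reads uncompressed files
  on.exit(close(con))
  hdr <- readBin(con, "raw", n = 23L)
  if (length(hdr) < 23L || !identical(rawToChar(hdr[1:7]), "RDX3\nX\n"))
    stop("not a version-3 XDR workspace file")
  # Bytes 8-19 hold three integers (format version and two R versions);
  # bytes 20-23 hold the length of the encoding name.
  len <- readBin(hdr[20:23], "integer", n = 1L, size = 4L, endian = "big")
  rawToChar(readBin(con, "raw", n = len))
}
```

Called as read_rdata_encoding("NAVCO 1.3 List.RData"), this would be
expected to return "ANSI_X3.4-1968" for the file dumped above, since the
length field there is 0x0e (14) and the 14 bytes after it spell that name.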
Duncan Murdoch
>
>
> I'm running R 4.0.5 under macOS 11.2.3; see "sessionInfo()" and
> detailed instructions below on the precise file I downloaded from the web
> and tried to read.
>
>
> I may be able to get what I want just ignoring this. However, I'd
> like to know how to fix this.
>
>
> Thanks,
> Spencer Graves
>
>
> sessionInfo()
> R version 4.0.5 (2021-03-31)
> Platform: x86_64-apple-darwin17.0 (64-bit)
> Running under: macOS Big Sur 10.16
>
> Matrix products: default
> LAPACK:
> /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> loaded via a namespace (and not attached):
> [1] compiler_4.0.5 htmltools_0.5.1.1 tools_4.0.5 yaml_2.2.1
> [5] tinytex_0.31 rmarkdown_2.7 knitr_1.31 digest_0.6.27
> [9] xfun_0.22 rlang_0.4.10 evaluate_0.14
> > search()
> [1] ".GlobalEnv" "file:NAVCO 1.3 List.RData"
> [3] "file:NAVCO 1.3 List.RData" "tools:rstudio"
> [5] "package:stats" "package:graphics"
> [7] "package:grDevices" "package:utils"
> [9] "package:datasets" "package:methods"
> [11] "Autoloads" "package:base"
>
>
> *** To get the file I used for this, I went to
> "https://www.ericachenoweth.com/research". From there I clicked
> "Version 1.3". This took me to
>
>
> https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ON9XND
>
>
> I then clicked the "Download" icon to the right of "NAVCO 1.3 List.tab".
> This gave me 5 "Download Options", one of which was "RData Format"; I
> selected that. This downloaded "NAVCO 1.3 List.RData", which I moved to
> getwd(). Then I did 'load("NAVCO 1.3 List.RData")' and 'attach("NAVCO
> 1.3 List.RData")'. Both of those gave me 8 repetitions of a message
> like "input string ... cannot be translated to UTF-8, is it valid in
> 'ANSI_X3.4-1968'?" with different values substituted for "...".
>