[R] input string ... cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?
John Kane
jrkr|de@u @end|ng |rom gm@||@com
Sun Apr 25 19:46:55 CEST 2021
The tab format seems to read in with no problem.
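
For anyone following along, here is a minimal sketch of the kind of call I
mean. The real file would be "NAVCO 1.3 List.tab" from the Dataverse page; the
temporary file, the column names, and the choice of encoding = "UTF-8" below
are my own assumptions for illustration, not something taken from the data set.

```r
# Stand-in for the downloaded .tab file: a tiny tab-delimited file that
# contains one non-ASCII (UTF-8) value. Column names are made up.
tmp <- tempfile(fileext = ".tab")
writeLines(c("id\tname", "1\tCaf\u00e9"), tmp, useBytes = TRUE)

# encoding = "UTF-8" asks R to mark the strings it reads as UTF-8
# rather than re-encoding them to the native encoding.
dat <- read.delim(tmp, encoding = "UTF-8", stringsAsFactors = FALSE)

str(dat)
Encoding(dat$name)
```

With the real download, the first argument would simply be the path to the
.tab file.
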
On Thu, 22 Apr 2021 at 23:08, Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>
> On 22/04/2021 9:25 p.m., Spencer Graves wrote:
> > Hello:
> >
> >
> > What if anything should I do regarding notes from either "load" or
> > "attach" that, "input string ... cannot be translated to UTF-8, is it
> > valid in 'ANSI_X3.4-1968'?"?
>
> First, ANSI_X3.4-1968 is an official name for a version of ASCII.
> It appears in the file near the start, where I believe it records the
> native encoding in place when the file was written, so readers using a
> different encoding can translate.
>
> Your actual file appears to have been encoded in UTF-8, but not marked
> as such. You're lucky you read it on macOS, where UTF-8 is the native
> encoding, since the reader probably recognized the bytes weren't ASCII
> bytes (and warned you about that), then just left them alone. If you
> read that file on Windows you'd likely get junk for those entries.
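
If the strings really are UTF-8 bytes that were never marked as such, one
possible repair after load() is to declare the encoding on the character
columns. This is only a sketch under that assumption: mark_utf8() is a
hypothetical helper of my own, and the demo data frame is built from raw bytes
rather than from the NAVCO file. Factor columns would need the same treatment
applied to their levels, which this sketch does not do.

```r
# Hypothetical helper: mark every character column of a data frame as UTF-8.
# It does not change any bytes; it only sets the declared encoding.
mark_utf8 <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], function(col) {
    Encoding(col) <- "UTF-8"
    col
  })
  df
}

# Demo: the UTF-8 bytes for "Cafe" with an acute accent on the e,
# turned into a string that carries no encoding mark.
raw_bytes <- as.raw(c(0x43, 0x61, 0x66, 0xc3, 0xa9))
x <- rawToChar(raw_bytes)

demo <- data.frame(id = 1L, name = x, stringsAsFactors = FALSE)
fixed <- mark_utf8(demo)

Encoding(fixed$name)   # declared encoding after the repair
validUTF8(fixed$name)  # checks that the bytes are well-formed UTF-8
```

Running validUTF8() on the columns first would be a sensible guard, since
marking bytes that are not valid UTF-8 just moves the problem elsewhere.
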
>
> For your interest, here's a dump of the start of your file, after
> gunzipping it:
>
> 00000000  52 44 58 33 0a 58 0a 00 00 00 03 00 03 06 00 00  |RDX3.X..........|
> 00000010  03 05 00 00 00 00 0e 41 4e 53 49 5f 58 33 2e 34  |.......ANSI_X3.4|
> 00000020  2d 31 39 36 38 00 00 04 02 00 00 00 01 00 04 00  |-1968...........|
> 00000030  09 00 00 00 01 78 00 00 03 13 00 00 00 10 00 00  |.....x..........|
> 00000040  02 0e 00 00 02 6e 40 90 0c 00 00 00 00 00 40 90  |.....n@.......@.|
> 00000050  44 00 00 00 00 00 40 10 00 00 00 00 00 00 40 7c  |D.....@.......@||
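
The header fields in the dump above can be picked apart in R itself. The
sketch below copies the first 37 bytes from the dump and decodes them. The
field layout (magic string, format marker, three big-endian integers, then a
length-prefixed encoding name) is my reading of the version 3 serialization
format, so treat it as an assumption. For the real file, the same bytes could
be obtained with readBin() on a gzfile() connection.

```r
# First 37 bytes of the gunzipped file, copied from the hex dump.
hdr <- as.raw(c(
  0x52, 0x44, 0x58, 0x33, 0x0a, 0x58, 0x0a, 0x00,
  0x00, 0x00, 0x03, 0x00, 0x03, 0x06, 0x00, 0x00,
  0x03, 0x05, 0x00, 0x00, 0x00, 0x00, 0x0e, 0x41,
  0x4e, 0x53, 0x49, 0x5f, 0x58, 0x33, 0x2e, 0x34,
  0x2d, 0x31, 0x39, 0x36, 0x38
))

# Decode a 4-byte big-endian integer from a raw vector.
be_int <- function(b) readBin(b, what = "integer", n = 1L, size = 4L,
                              endian = "big")

magic   <- rawToChar(hdr[1:4])   # save-file magic string
fmt     <- rawToChar(hdr[6])     # "X" marks the XDR binary format
version <- be_int(hdr[8:11])     # serialization format version

# Version fields are stored as major, minor, patch in the low three bytes.
writer <- paste(as.integer(hdr[13:15]), collapse = ".")
reader <- paste(as.integer(hdr[17:19]), collapse = ".")

# Length-prefixed name of the native encoding recorded by the writer.
n_enc <- be_int(hdr[20:23])
enc   <- rawToChar(hdr[24:(23 + n_enc)])

cat(magic, fmt, version, writer, reader, enc, sep = "\n")
```

If that reading is right, the file was written by R 3.6.0 in a session whose
native encoding was plain ASCII (a "C" locale is my guess), which would
explain why the UTF-8 strings went in unmarked.
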
>
> Duncan Murdoch
>
> >
> >
> > I'm running R 4.0.5 under macOS 11.2.3; see "sessionInfo()" and
> > detailed instructions below on the precise file I downloaded from the web
> > and tried to read.
> >
> >
> > I may be able to get what I want just ignoring this. However, I'd
> > like to know how to fix this.
> >
> >
> > Thanks,
> > Spencer Graves
> >
> >
> > sessionInfo()
> > R version 4.0.5 (2021-03-31)
> > Platform: x86_64-apple-darwin17.0 (64-bit)
> > Running under: macOS Big Sur 10.16
> >
> > Matrix products: default
> > LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
> >
> > locale:
> > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> >
> > attached base packages:
> > [1] stats graphics grDevices utils datasets methods base
> >
> > loaded via a namespace (and not attached):
> > [1] compiler_4.0.5    htmltools_0.5.1.1 tools_4.0.5       yaml_2.2.1
> > [5] tinytex_0.31      rmarkdown_2.7     knitr_1.31        digest_0.6.27
> > [9] xfun_0.22         rlang_0.4.10      evaluate_0.14
> > > search()
> > [1] ".GlobalEnv" "file:NAVCO 1.3 List.RData"
> > [3] "file:NAVCO 1.3 List.RData" "tools:rstudio"
> > [5] "package:stats" "package:graphics"
> > [7] "package:grDevices" "package:utils"
> > [9] "package:datasets" "package:methods"
> > [11] "Autoloads" "package:base"
> >
> >
> > *** To get the file I used for this, I went to
> > "https://www.ericachenoweth.com/research". From there I clicked
> > "Version 1.3". This took me to
> >
> >
> > https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ON9XND
> >
> >
> > I then clicked the "Download" icon to the right of "NAVCO 1.3 List.tab".
> > This gave me 5 "Download Options", one of which was "RData Format"; I
> > selected that. This downloaded "NAVCO 1.3 List.RData", which I moved to
> > getwd(). Then I did 'load("NAVCO 1.3 List.RData")' and 'attach("NAVCO
> > 1.3 List.RData")'. Both of those gave me 8 repetitions of a message
> > like "input string ... cannot be translated to UTF-8, is it valid in
> > 'ANSI_X3.4-1968'?" with different values substituted for "...".
> >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
--
John Kane
Kingston ON Canada