[R] Good practice for database with utf-8 string in package
Marc Girondot
Thu Sep 16 18:05:43 CEST 2021
Hello everyone,
I am a little bit stuck on how to include a database with UTF-8
strings in a package. When I submit it to CRAN, the check reports NOTEs
on several Unix systems, and I am trying to find a solution (if one
exists) that avoids these NOTEs.
The database contains references, and some author names have non-ASCII
characters.
* First, I don't agree at all with the solution proposed here:
https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Encoding-issues
"First, consider carefully if you really need non-ASCII text."
If a language has non-ASCII characters, it is not just to make the
writing nicer or more complex; it is because they change the
pronunciation.
* Second, I am trying to find a way to avoid these NOTEs.
For example, here is a reference with UTF-8 characters:
> DatabaseTSD$Reference[211]
[1] Hernández-Montoya, V., Páez, V.P. & Ceballos, C.P. (2017) Effects of
temperature on sex determination and embryonic development in the
red-footed tortoise, Chelonoidis carbonarius. Chelonian Conservation and
Biology 16, 164-171.
When I convert the non-ASCII characters into <U+XXXX> escapes, I do
indeed get only ASCII characters. Perfect.
> iconv(DatabaseTSD$Reference[211], "UTF-8", "ASCII", "Unicode")
[1] "Hern<U+00E1>ndez-Montoya, V., P<U+00E1>ez, V.P. & Ceballos, C.P.
(2017) Effects of temperature on sex determination and embryonic
development in the red-footed tortoise, Chelonoidis carbonarius.
Chelonian Conservation and Biology 16, 164-171."
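For anyone who wants to reproduce the conversion without my package, here is a minimal sketch. The string `ref` is a shortened, hypothetical stand-in for the DatabaseTSD entry, written with \u escapes so that the script itself stays ASCII. I am assuming an R version whose iconv() accepts sub = "Unicode"; the fourth positional argument in my call above is that same `sub` argument.

```r
# Hypothetical stand-in for DatabaseTSD$Reference[211] (shortened).
# \u00e1 is "a with acute accent"; the escape keeps this file pure ASCII.
ref <- "Hern\u00e1ndez-Montoya, V., P\u00e1ez, V.P."

# Replace every character that cannot be represented in ASCII
# by a <U+XXXX> marker instead of returning NA.
ref_ascii <- iconv(ref, from = "UTF-8", to = "ASCII", sub = "Unicode")
print(ref_ascii)

# Check that only ASCII code points (below 128) remain.
only_ascii <- all(utf8ToInt(ref_ascii) < 128L)
print(only_ascii)
```

Note that the result is a different string: the markers are literal text, so printing it shows <U+00E1> rather than the accented letter.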
With that, I get no NOTEs when I check the package with the database on
Unix... but how can I print the reference back with its original
characters?
Thanks a lot for pointing me to best practices for including databases
with non-ASCII characters without getting NOTEs when submitting the
package to CRAN.
Marc