[R] Problem with reading data from an UTF-16 database

Duncan Murdoch murdoch.duncan at gmail.com
Wed Mar 14 20:45:28 CET 2018


On 14/03/2018 7:15 AM, Thierry Onkelinx wrote:
> Dear all,
> 
> We have a problem with reading some characters correctly from an UTF-16
> encoded database. The code below givens the correct characters on Ubuntu
> with the_driver = {ODBC Driver 13 for SQL Server}. On Windows (with
> the_driver = {SQL Server}), special characters like '♂' and '♀' are
> returned as '?'. I've added the sessionInfo() output from both machines.
> 
> Any suggestions on how to fix the problem?

I haven't tried it, but the RODBC package includes an argument 
DBMSencoding in the odbcDriverConnect function.  So maybe you could use 
that instead of DBI and odbc.

Duncan Murdoch
> 
> Best regards,
> 
> Thierry
> 
> library(DBI)
> con <- dbConnect(odbc::odbc(), .connection_string =
> "Driver=the_drive;Server=our_server;Database=the_database;Trusted_Connection=Yes;")
> dbGetQuery(con, sql_statement)
> 
> 
> R version 3.4.2 (2017-09-28)
> Platform: i386-w64-mingw32/i386 (32-bit)
> Running under: Windows 7 x64 (build 7601) Service Pack 1
> 
> Matrix products: default
> 
> locale:
> [1] LC_COLLATE=Dutch_Belgium.1252  LC_CTYPE=Dutch_Belgium.1252
> LC_MONETARY=Dutch_Belgium.1252 LC_NUMERIC=C
> [5] LC_TIME=Dutch_Belgium.1252
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> 
> other attached packages:
> [1] RODBC_1.3-15 DBI_0.8
> 
> loaded via a namespace (and not attached):
>   [1] Rcpp_0.12.14    R6_2.2.2        odbc_1.1.5      magrittr_1.5
> pillar_1.1.0    rlang_0.2.0     testthat_2.0.0
>   [8] blob_1.1.0      drat_0.1.3      fortunes_1.5-4  tools_3.4.2
>   bit64_0.9-7     bit_1.1-12      hms_0.4.0
> [15] yaml_2.1.14     compiler_3.4.2  pkgconfig_2.0.1 tibble_1.4.2
> 
> 
> R version 3.4.3 (2017-11-30)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 16.04.4 LTS
> 
> Matrix products: default
> BLAS: /usr/lib/libblas/libblas.so.3.6.0
> LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
> 
> locale:
>   [1] LC_CTYPE=nl_BE.UTF-8       LC_NUMERIC=C
>   LC_TIME=nl_BE.UTF-8        LC_COLLATE=nl_BE.UTF-8
>   [5] LC_MONETARY=nl_BE.UTF-8    LC_MESSAGES=nl_BE.UTF-8
> LC_PAPER=nl_BE.UTF-8       LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
>   LC_MEASUREMENT=nl_BE.UTF-8 LC_IDENTIFICATION=C
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> 
> other attached packages:
> [1] DBI_0.7
> 
> loaded via a namespace (and not attached):
>   [1] drat_0.1.3     bit_1.1-12     odbc_1.1.2     compiler_3.4.3 hms_0.3
>      tools_3.4.3    pillar_1.2.1   tibble_1.4.2
>   [9] yaml_2.1.17    Rcpp_0.12.14   bit64_0.9-7    blob_1.1.0
>   rlang_0.2.0    fortunes_1.5-4
> 
> 
> 
> ir. Thierry Onkelinx
> Statisticus / Statistician
> 
> Vlaamse Overheid / Government of Flanders
> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
> FOREST
> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> thierry.onkelinx at inbo.be
> Havenlaan 88 bus 73, 1000 Brussel
> www.inbo.be
> 
> ///////////////////////////////////////////////////////////////////////////////////////////
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to say
> what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
> ///////////////////////////////////////////////////////////////////////////////////////////
> 
> <https://www.inbo.be>
> 
> 	[[alternative HTML version deleted]]
> 
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list