[R] Sweave'ing Danish characters
Peter Jepsen
PJ at DCE.AU.DK
Tue Jan 27 13:06:44 CET 2009
Thank you, Duncan! It works perfectly!
Best regards,
Peter.
-----Original Message-----
From: Duncan Murdoch [mailto:murdoch at stats.uwo.ca]
Sent: 27 January 2009 13:04
To: Peter Jepsen
Cc: r-help at r-project.org
Subject: Re: [R] Sweave'ing Danish characters
On 26/01/2009 5:44 PM, Peter Jepsen wrote:
> Hi,
>
> I am writing an Sweave document and am using 'xtable' to make frequency tables of diagnoses of people undergoing cholecystectomy. Some of these diagnoses contain Danish characters ("æ", "ø", and "å"), and these characters are all garbled in the LaTeX document after I run Sweave. The odd thing is that everything looks absolutely right in the R console, and if I enter the same Danish characters in a new variable, the new variable produces no problems. Therefore, I cannot offer a reproducible example, but I am hoping nonetheless that someone can point me towards a solution.
This looks like an encoding problem: there are several different
standards for encoding non-ASCII characters. All of your tools have to
agree on the encoding.
To my eye it looks as though in the first case R is writing out UTF-8,
and whatever you are using to look at your .tex file is assuming latin1.
(Some Windows programs say "ANSI", but I think that doesn't fully
specify the encoding: you also need a code page, which is set somewhere
in the Windows Control Panel.)
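As a rough illustration of that diagnosis, the sketch below takes the UTF-8 bytes for "Kræft" and reads them back as if they were latin1. It is not taken from the original data; the variable names are made up for the example, and the string is built from raw bytes so the result does not depend on how the script file itself is encoded.

```r
# UTF-8 bytes for "Kræft": the letter æ is the two-byte sequence C3 A6.
utf8_bytes <- as.raw(c(0x4b, 0x72, 0xc3, 0xa6, 0x66, 0x74))

# Declared correctly as UTF-8, the six bytes are five characters.
as_utf8 <- rawToChar(utf8_bytes)
Encoding(as_utf8) <- "UTF-8"
n_chars_utf8 <- nchar(as_utf8, type = "chars")

# Declared as latin1, every byte is treated as a character of its own,
# so C3 and A6 become two separate characters instead of one æ.
as_latin1 <- rawToChar(utf8_bytes)
Encoding(as_latin1) <- "latin1"

# Re-encode the mis-declared string to UTF-8: this is what a viewer
# that assumes latin1 would display for the same file.
garbled <- iconv(as_latin1, from = "latin1", to = "UTF-8")
n_chars_garbled <- nchar(garbled, type = "chars")
```

Printing `garbled` in a UTF-8 terminal should show two characters where the æ belongs, which is the usual signature of UTF-8 text being read as latin1.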
The functions related to encodings in R are:
  options(encoding="latin1") - set the default encoding for connections
  iconv(x, from="latin1", to="UTF-8") - re-encode entries, mapping each
    character from one encoding to the other
  Encoding(x) - display the declared encoding of each entry ("unknown"
    means ASCII or the native encoding for your platform)
  Encoding(x) <- "latin1" - change the declared encoding, without
    changing the bytes
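A minimal sketch of how these functions fit together, assuming the stored diagnosis text is really UTF-8 while the LaTeX side expects latin1. The variable `diag` is hypothetical and stands in for one entry of the real data; the point is the difference between declaring an encoding (bytes unchanged) and converting (bytes changed).

```r
# Bytes as they might arrive from a file written in UTF-8. The string
# carries no encoding mark at this point.
diag <- rawToChar(as.raw(c(0x4b, 0x72, 0xc3, 0xa6, 0x66, 0x74)))
mark_before <- Encoding(diag)

# Declare the encoding: the mark changes, the bytes do not.
Encoding(diag) <- "UTF-8"
bytes_after_mark <- charToRaw(diag)

# Re-encode for a latin1 toolchain: the bytes change (C3 A6 becomes E6).
diag_latin1 <- iconv(diag, from = "UTF-8", to = "latin1")
bytes_latin1 <- charToRaw(diag_latin1)
mark_latin1 <- Encoding(diag_latin1)
```

If the real column is a factor rather than a character vector (an assumption, since the data are not shown), the same conversion would have to be applied to `levels()` of the factor before the table is passed to xtable.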
Duncan Murdoch
> To illustrate:
>
>> library(xtable)
>> library(Hmisc)
>> rm(list=ls())
>> load("u:/kirurgi/cholecystit/Chol_oprenset.Rdata")
>>
>> test2 <- chol$nydiag[3] # This 3rd observation contains a diagnosis with Danish characters ("Kræft i fordøjelsessystemet", meaning gastrointestinal cancer).
>>
>> print(xtable(table(test2)))
> % latex table generated in R 2.8.1 by xtable 1.5-4 package
> % Mon Jan 26 23:31:37 2009
> \begin{table}[ht]
> \begin{center}
> \begin{tabular}{rr}
> \hline
> & test2 \\
> \hline
> Kræft i fordøjelsessystemet & 1 \\ # It looks right here, but in the .tex-file it says "KrÃ¦ft i fordÃ¸jelsessystemet"
> \hline
> \end{tabular}
> \end{center}
> \end{table}
>
>> print(xtable(table("Kræft i fordøjelsessystemet"))) # This, on the other hand, works like a charm.
> % latex table generated in R 2.8.1 by xtable 1.5-4 package
> % Mon Jan 26 23:36:53 2009
> \begin{table}[ht]
> \begin{center}
> \begin{tabular}{rr}
> \hline
> & V1 \\
> \hline
> Kræft i fordøjelsessystemet & 1 \\ # See, no problems here!
> \hline
> \end{tabular}
> \end{center}
> \end{table}
>
>
> I am using Windows Vista 64-bit and MiKTeX 2.7.
>
> Best regards,
> Peter.
>
>> sessionInfo()
> R version 2.8.1 (2008-12-22)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=Danish_Denmark.1252;LC_CTYPE=Danish_Denmark.1252;LC_MONETARY=Danish_Denmark.1252;LC_NUMERIC=C;LC_TIME=Danish_Denmark.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] Hmisc_3.4-4 foreign_0.8-30 xtable_1.5-4
>
> loaded via a namespace (and not attached):
> [1] cluster_1.11.12 grid_2.8.1 lattice_0.17-20 tools_2.8.1
>