[R] Problem with ONE of the Special German Characters
Duncan Murdoch
murdoch at stats.uwo.ca
Thu Apr 15 19:32:48 CEST 2010
On 15/04/2010 12:22 PM, Michael Stegh wrote:
> Dear List,
>
> I have data which contain the special German characters "ä", "ö", "ü" etc. After reading the
> text files into R those characters are displayed strangely, e.g. "ä" is "Ã¤". The first step is to
> replace those with their typical transcription, e.g. "ä" becomes "ae" by using the gsub
> command.
>
Your example of "Ã¤" is what you would see if the file was stored in
UTF-8 encoding and then read as Latin-1. So I suspect you need to declare
the encoding of the files before reading them. You can do this as
follows:
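con <- file("foo.txt", encoding="UTF-8", open="r")  # declare the file's encoding
readLines(con)                                      # input is re-encoded to the native encoding
close(con)                                          # always close the connection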
By default, R assumes the encoding of files matches the default encoding
on your system.
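
To make the mechanism concrete, here is a minimal sketch that writes the two UTF-8 bytes of "ä" to a temporary file and reads them back under two different encoding declarations. The temporary file and the variable names (path, good, bad) are made up for illustration. Note that this sketch uses the encoding argument of readLines(), which only marks the strings as UTF-8 or Latin-1 and does not re-encode them the way file(..., encoding=) does; I chose it here only because it behaves the same regardless of the locale of the R session.

```r
# Hypothetical test file: one line holding "ä" as UTF-8 (bytes C3 A4) plus a newline
path <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0xc3, 0xa4, 0x0a)), path)

# Correct declaration: the two bytes are treated as one UTF-8 character
good <- readLines(path, encoding = "UTF-8")
utf8ToInt(good)            # 228, i.e. U+00E4 ("ä")

# Wrong declaration: each byte is treated as a separate Latin-1 character
bad <- readLines(path, encoding = "latin1")
utf8ToInt(enc2utf8(bad))   # 195 164, i.e. "Ã" followed by "¤"

unlink(path)               # remove the temporary file
```

The second result is the "Ã¤" pattern: one character stored as two bytes, shown as two characters.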
> Until I upgraded to version 2.10.1 (from 2.8.0) this worked perfectly for all characters. Now it
> works for all characters but "Ü".
>
> temp1<-gsub("Ãoe","Ue",temp1)
>
You might want to try perl=TRUE in the gsub() call; it seems to handle
strange characters in regular expressions better than the default TRE
library does.
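
As a rough sketch of that suggestion, the substitution could look like the following. The sample vector is invented, and I am assuming that the misread "Ü" consists of the characters U+00C3 ("Ã") and U+0153 ("œ"), which is what the UTF-8 bytes C3 9C look like when read as Windows-1252. The patterns are written with \u escapes so that the script itself contains only ASCII; that is my choice for the example, not something from the original code. Whether this fixes the behaviour seen in 2.10.1 would need to be checked on the affected system.

```r
# Hypothetical misread data: "Übung" and "Mädchen" stored as UTF-8, read as Windows-1252
temp1 <- c("\u00c3\u0153bung", "M\u00c3\u00a4dchen")

# Replace the misread "Ü" and "ä" with their transcriptions, using PCRE
temp1 <- gsub("\u00c3\u0153", "Ue", temp1, perl = TRUE)
temp1 <- gsub("\u00c3\u00a4", "ae", temp1, perl = TRUE)
temp1

# Alternative (an assumption, not from the advice above): the patterns are
# literal strings, so fixed = TRUE avoids the regular expression engine entirely
temp2 <- gsub("\u00c3\u0153", "Ue", "\u00c3\u0153bung", fixed = TRUE)
```

If the files are read with the correct encoding in the first place, the patterns would simply be the real characters, e.g. "\u00dc" for "Ü".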
Duncan Murdoch
> This letter is displayed as "Ãoe" (as before), but R is no longer able to find this character. The
> problem seems to be linked to the "oe" part, since I could substitute for "Ã" without a problem.
> Strangely if I get the two characters by extracting them with the substr command to a variable
> and then using the variable I am able to substitute without a problem. Any ideas, what I am
> missing?
>
> Thanks,
>
> Michael
>