[R] How to substitute special characters within a data frame?
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri Aug 15 12:43:36 CEST 2008
You've not told us the 'at a minimum' information requested in the posting
guide. What OS? What locale? And how did you 'import'?
But here's a guess. If you change \\345 to \345, it should render
correctly in a Latin-1 locale:
> "H\345rkan"
[1] "Hårkan"
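To make that change programmatically rather than by hand, something along these lines might do. It is only a sketch. The helper name decode_octal is made up here, and it assumes every escape is a backslash followed by exactly three octal digits giving a Latin-1 code. Latin-1 codes coincide with the first 256 Unicode code points, so intToUtf8() can be used, and the result is then a UTF-8 string rather than a Latin-1 one:

```r
# Hypothetical helper: turn literal "\345"-style escapes into real characters.
# Assumes each escape is a backslash plus exactly three octal digits (Latin-1).
decode_octal <- function(x) {
  m <- gregexpr("\\\\[0-7]{3}", x)                 # locate the literal escapes
  regmatches(x, m) <- lapply(regmatches(x, m), function(esc) {
    codes <- strtoi(substring(esc, 2), base = 8L)  # drop the backslash, parse octal
    intToUtf8(codes, multiple = TRUE)              # one character per code point
  })
  x
}

decoded <- decode_octal("H\\345rkan")
```

This expects a character vector, so a factor column would need as.character() first.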
If this is a UTF-8 locale, convert it
> iconv("H\345rkan", "latin1")
[1] "Hårkan"
and if you have an unsuitable locale, e.g. a Chinese one
> iconv("H\345rkan", "latin1", "ASCII//TRANSLIT")
[1] "Harkan"
or
> gsub("\\\\345", "aa", "H\\345rkan")
[1] "Haarkan"
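Since the question was about a whole data frame, the same gsub() idea can be looped over the character columns. What follows is a sketch under assumptions: the data frame df, its column names and the replacement table repl are invented for illustration, and the columns are assumed to be character vectors, not factors.

```r
# Hypothetical data frame holding the literal escapes described in the question
df <- data.frame(name = c("H\\345rkan", "Bj\\366rn"), n = 1:2,
                 stringsAsFactors = FALSE)

# Regex for each literal escape -> ASCII replacement (the choices are assumptions)
repl <- c("\\\\345" = "aa", "\\\\366" = "oe")

# Apply every substitution to each character column, leaving other columns alone
is_chr <- vapply(df, is.character, logical(1))
df[is_chr] <- lapply(df[is_chr], function(x) {
  for (pat in names(repl)) x <- gsub(pat, repl[[pat]], x)
  x
})

df$name
```

Add further entries to repl for any other escapes that turn up in the data.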
On Fri, 15 Aug 2008, Yingfu Xie wrote:
> Hello all,
>
> I have a data frame in R, imported from an excel file in Swedish. The
> original file contains several columns that have special characters,
> such as å, ö, and so on. After import such special characters
> are represented in the data frame by "\\345", "\\366" etc (don't ask me
> why). For example, a word "Hårkan" becomes ''H\\345rkan".
That's odd: the quotes do not match.
We do need to ask you 'why', as we have nothing reproducible here.
> Now my question is if it is possible to substitute those "H\\345rkan" by
> "Haarkan" or simply "Harkan" in R, ideally by finding those "\\345" and
> then replacing.
>
> Thanks in advance,
> Yingfu
>
> [[alternative HTML version deleted]]
Please don't (as the posting guide asked). Properly encoded plain text
has a chance of working.
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595