[R] Using unicode symbol has unexpected results in levels of factor object
peter dalgaard
pdalgd at gmail.com
Thu Aug 9 11:02:49 CEST 2012
On Aug 9, 2012, at 06:53 , Wyatt, Kristin M wrote:
> Dear all,
>
> When I use a unicode symbol in the labels for a factor object, the corresponding level does not display as expected. However, using levels() on the factor returns the desired output. I noticed the discrepancy when the legend labels from a call to ggplot() did not display the desired symbol, but an explicitly built legend using the same labels did.
>
> Example (I am trying to get the less than or equal to symbol):
>
>> .df <- data.frame(afp = c(0,0,1,1), time=c(0,2,0,1), surv=c(1, 0.5, 1, 0.4))
>> afpLabels <- c("AFP \u2264 16", "AFP > 16")
>> afpStrata <- factor(.df$afp, labels=afpLabels)
>> afpStrata
> [1] AFP ≤ 16 AFP ≤ 16 AFP > 16 AFP > 16
> Levels: AFP = 16 AFP > 16
>
> The first level is reported as "AFP = 16".
>
>> levels(afpStrata)
> [1] "AFP ≤ 16" "AFP > 16"
>>
>
> The desired result is produced with levels().
>
>
> The code below shows this issue in context through calls to ggplot() if you don't mind loading all the libraries.
>
>> library(ggplot2)
>> library(gridExtra)
>> library(plyr)
>>
>> ggplot(.df, aes(time, surv)) + geom_step(aes(color = afpStrata), size = 1.0)
>>
>> ggplot(.df, aes(time, surv)) + geom_step(aes(color = afpStrata), size = 1.0) +
> + scale_colour_hue(breaks=afpLabels, labels=afpLabels)
>>
>
> I am running a pre-compiled version of R on Windows 7 (64-bit).
>> sessionInfo()
> R version 2.15.1 (2012-06-22)
> Platform: x86_64-pc-mingw32/x64 (64-bit)

For whatever it is worth, this works fine (both examples) under OS X Snow Leopard.

Looking at the code for print.factor, I would strongly suspect that the culprit is the line

    n <- length(lev <- encodeString(levels(x), quote = ifelse(quote,
        "\"", "")))

which figures, since you are in a .1252 locale, not .utf8 (or UTF-8 or ...).

Over to the Windows/locale/charset experts...
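
One way to check that suspicion on the affected machine is sketched below. It uses only base R and reuses the labels from the example above; the names lev, cp and enc are just illustrative. Whether encodeString() is really where the symbol gets altered is still an assumption, so treat this as a diagnostic, not a fix.

```r
# Labels from the original example; "\u2264" is the less-than-or-equal sign.
afpLabels <- c("AFP \u2264 16", "AFP > 16")
afpStrata <- factor(c(0, 0, 1, 1), labels = afpLabels)

lev <- levels(afpStrata)

# The stored level still holds code point U+2264 (decimal 8804),
# independent of how the console displays it.
cp <- utf8ToInt(lev[1])[5]
print(cp)

# Declared encoding of each level, and the session's locale/charset settings.
print(Encoding(lev))
print(l10n_info())

# The same call print.factor makes before writing the "Levels:" line.
# Assumption: if the symbol is altered here, the locale conversion is the cause.
enc <- encodeString(lev, quote = "")
print(enc)
```

If the last line shows the symbol intact in a UTF-8 session but altered in a .1252 session, that would be consistent with the locale conversion being responsible.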
--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com