[R] Finicky factor comparison operators
David Winsemius
dwinsemius at comcast.net
Mon Feb 20 14:57:07 CET 2012
On Feb 20, 2012, at 1:45 AM, johnmark wrote:
> Michael -
>
> Thanks for your insight. I think I see where you're going with this.
>
> To make '==' comparisons for subsetting against an ordered factor, I've had
> to create a lookup table for all possible values I'd ever want to compare
> against (all dates covered by the quarters in question, in this case) that
> maps into the ordered factor's values. This is wrapped by a function that
> returns an ordered factor, which allows me to write:
>
> /(opps$close_quarter == which.quarter.end("2010-10-20")/
>
> Otherwise if I try to create an ordered factor from the constant just for
> the purposes of comparison, the error tells me that ordered factors from
> different sources cannot be compared:
>
> /(opps$close_quarter == factor("2007-10-20", ordered=T)
> Error in Ops.factor(factor("2007-10-30", ordered = T), quarter.factors[1, 2]) :
>   level sets of factors are different/
Actually it is telling you that you cannot compare ordered factors
that have different level sets. That makes perfect sense, for the same
reason that you are not allowed to compare Dates to ordered factors.
If the factors from different sources had the same levels, the
comparison would have succeeded.
> z <- factor(LETTERS[3:1], ordered = TRUE)
> z3 <- factor(LETTERS[1:3] , ordered=TRUE)
> z[2] == z3[2]
[1] TRUE
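Applied to a date-like case, a minimal sketch might look like the
following. The quarter-end dates and the name qtr are made up for
illustration, since no test data was posted; the idea is only that the
constant is built on the level set of the factor it is compared with.

```r
# Hypothetical quarter-end dates standing in for opps$close_quarter
qtr <- factor(c("2007-10-31", "2008-01-31", "2008-04-30"), ordered = TRUE)

# A one-off ordered factor carries only its own level, so the level
# sets differ and the comparison stops with an error
msg <- tryCatch(qtr == factor("2008-04-30", ordered = TRUE),
                error = function(e) conditionMessage(e))

# Building the constant on the same level set makes the comparison legal
const <- factor("2008-04-30", levels = levels(qtr), ordered = TRUE)
same <- qtr == const
```

Here msg holds the error text and same is a logical vector usable for
subsetting.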
>
> That makes sense, since internally factors are integers -- "enums" in other
> terms.
>
> But what I want to avoid -- and what I don't see as necessary -- is explicitly
> coercing the terms to a common representation that mimics their print form:
>
> /as.character("2007-10-20") == as.character(factor("2007-10-20", ordered=T))/
> I don't think there should be confusion since the conversion to print form
> is "obvious" -- but it does conflict with the conversion rules for creating
> vectors by c():
>
> /c("2011-10-20", factor("2007-10-20", ordered=T))
> [1] "2011-10-20" "1" /
>
> where the factor is converted to its internal "enum" representation, then to
> a character.
That is just an example of the need to use as.character when converting
data out of factor class.
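A small sketch of the difference, reusing the constant from your c()
example (the names f, with_codes and with_labels are only for
illustration):

```r
f <- factor("2007-10-20", ordered = TRUE)

# With a character first argument, c() takes the factor's integer code
with_codes <- c("2011-10-20", f)

# Converting explicitly takes the factor's label instead
with_labels <- c("2011-10-20", as.character(f))
```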
>
> Having given this some more thought to what motivated the original question,
> one could use "which()" to invert the factor's levels vector:
>
> /which("2008-04-30" == levels(quarter.factors[,2]))
> [1] 3 /
>
> It's still not clear to me what exactly are the implicit conversion rules for
> factors.
In your last case you are comparing a character value to a character
vector and getting the expected result, since
levels(quarter.factors[,2]) is NOT a factor. You should also succeed
when testing equality between an ordered factor and a character value.
You have still not provided an example for testing, so this may suffice:
> z <- factor(LETTERS[3:1], ordered = TRUE)
> z == "A"
[1] FALSE FALSE TRUE
You should be able to assemble a list of valid candidate (character)
values with levels(fac). Or, if you want them in factor representation,
use unique(fac).
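As a rough sketch using the same z as above (the names cand_chr,
cand_fac and is_valid are only for illustration):

```r
z <- factor(LETTERS[3:1], ordered = TRUE)

cand_chr <- levels(z)   # all levels, as a character vector
cand_fac <- unique(z)   # distinct values present, still an ordered factor

# Check a candidate string against the level set before comparing
is_valid <- "B" %in% cand_chr
```

Note that levels() lists every level whether or not it occurs in the
data, while unique() returns only the values actually present.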
--
David Winsemius, MD
West Hartford, CT