[R] Eliminate level information
darrelkj
darrelkj at mail.uc.edu
Sat Jul 9 21:38:28 CEST 2011
Hi, I hope this formatting is correct, as this is my first post.
I am trying to do comparisons of values in a data frame that has some factor
variables.
One instance is
> train$sex[2]
[1] Male
Levels: Female Male
So the value is Male, but a comparison like "Male" == train$sex[2]
always returns FALSE for me, apparently because of the level information
attached to the value.
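For reference, here is a minimal sketch of that comparison on a toy factor. The vectors `sex` and `sex_padded` are made up for illustration and are not the `train` data. On a clean factor, `==` against a string compares labels. Stray whitespace in the labels is only one possible cause of a FALSE result, and it is an assumption here, not something confirmed for this data.

```r
# Toy factor standing in for train$sex (hypothetical data, not from the post).
sex <- factor(c("Female", "Male", "Male"))

# Comparing a factor element with a string compares the label of that element.
clean_cmp <- sex[2] == "Male"
char_cmp  <- as.character(sex[2]) == "Male"

# Assumption: labels read from a file may carry leading spaces, e.g. " Male".
sex_padded <- factor(c(" Female", " Male"))
padded_cmp  <- sex_padded[2] == "Male"                         # label is " Male"
trimmed_cmp <- trimws(as.character(sex_padded[2])) == "Male"   # whitespace removed

print(c(clean = clean_cmp, char = char_cmp,
        padded = padded_cmp, trimmed = trimmed_cmp))
```

Running levels(train$sex) would show whether the labels are exactly "Female" and "Male" or contain extra characters.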
Another problem this creates is
> factor(train$workclass[25:30])
[1] Private Local-gov Private NA Private
[6] Private
Levels: Local-gov NA Private
> is.na(train$workclass[25:30])
[1] FALSE FALSE FALSE FALSE FALSE FALSE
They are all FALSE, apparently because of the level data involved in the
comparison. This looks like a bug to me: I thought NA was a reserved keyword,
yet here it is being used as a level, so the entry is not treated as a
missing value.
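The behaviour above can be reproduced with a small sketch, assuming the column holds the literal string "NA" and not a true missing value. The vector `workclass` is hypothetical toy data. A string "NA" becomes an ordinary level, so is.na() reports FALSE for it until it is turned into a real NA.

```r
# Toy data: the fourth entry is the string "NA", not a missing value.
workclass <- factor(c("Private", "Local-gov", "Private", "NA", "Private", "Private"))

levels_before <- levels(workclass)   # "NA" is listed as a level
na_before     <- is.na(workclass)    # all FALSE, as in the post

# Turn the "NA" label into a real missing value, then drop the unused level.
workclass[workclass == "NA"] <- NA
workclass <- droplevels(workclass)

levels_after <- levels(workclass)
na_after     <- is.na(workclass)

print(levels_after)
print(na_after)
```

If the data was read with read.table() or read.csv(), the na.strings argument is the usual place to declare which strings count as missing, though whether that applies here depends on how `train` was created.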
I tried a conversion using data.matrix(), but that gets rid of all factor
information and makes things worse. Is there a way to suppress the
'Levels: Female Male' information?
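One way to get values without the Levels line is as.character(), sketched below on a hypothetical data frame `toy` (not the `train` data). Unlike data.matrix(), which replaces factor labels with integer codes, this keeps the labels as plain strings and leaves non-factor columns alone.

```r
# Toy data frame with one factor column (hypothetical, for illustration only).
toy <- data.frame(sex = c("Male", "Female"),
                  age = c(39, 50),
                  stringsAsFactors = TRUE)

# data.matrix() turns factor labels into their integer codes.
codes <- data.matrix(toy)

# Convert every factor column to character; other columns are unchanged.
toy[] <- lapply(toy, function(col) if (is.factor(col)) as.character(col) else col)

print(codes)
print(toy$sex)   # plain character vector, printed without a Levels line
```

The trade-off is that the converted columns are no longer factors, so modelling functions that expect factors would need them converted back with factor().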
I hope this makes sense, thanks.
--
View this message in context: http://r.789695.n4.nabble.com/Eliminate-level-information-tp3656643p3656643.html
Sent from the R help mailing list archive at Nabble.com.