[R] any r package can handle factor levels not in the test set

Richard M. Heiberger rmh at temple.edu
Tue Jan 13 03:08:46 CET 2015


You need to define the factor levels in the training set so that they
include every level you might see, including levels that occur only in
the test set. Something like this:

> A <- factor(letters[1:5])
> B <- factor(letters[c(1,3,5,7,9)])
> A
[1] a b c d e
Levels: a b c d e
> B
[1] a c e g i
Levels: a c e g i
> # Declare A over the combined levels of A and B, so g and i are known levels
> training <- factor(A, levels=unique(c(levels(A), levels(B))))
> training
[1] a b c d e
Levels: a b c d e g i
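
The same idea can be applied to a pair of data frames. What follows is
only a sketch under assumptions: the data frames `train` and `test` and
the column `x` are made up for illustration and do not come from the
original question. Sharing levels may also not be sufficient on its own,
since some fitting functions drop unused levels when they build the model
frame, so it is worth checking what the fitted object actually stores.

```r
# Hypothetical data; 'train', 'test' and 'x' are illustrative names only.
train <- data.frame(x = c("a", "b", "c", "d", "e"), stringsAsFactors = FALSE)
test  <- data.frame(x = c("a", "c", "e", "g", "i"), stringsAsFactors = FALSE)

# Union of the values observed in either set, in order of first appearance.
all_levels <- union(train$x, test$x)

# Declare both factors over the same level set.
train$x <- factor(train$x, levels = all_levels)
test$x  <- factor(test$x,  levels = all_levels)

# Levels that occur in the test set but have no observations in training.
unseen <- setdiff(levels(droplevels(test$x)), levels(droplevels(train$x)))
print(unseen)
```

Here `unseen` holds g and i, the two levels for which the training data
carries no information. Listing them explicitly makes it easier to decide
how such rows should be handled before calling predict().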

In the future please "provide commented, minimal, self-contained,
reproducible code."

On Mon, Jan 12, 2015 at 9:00 PM, HelponR <suncertain at gmail.com> wrote:
> It looks like gbm and glm both have this issue.
>
> I wonder whether any R package is immune to this?
>
> In practice, it is quite normal for test data to contain values not seen
> in the training data. Does this mean I have to give up on R?
>
> Thanks!


