[R] Chaid Decision Tree
Achim Zeileis
Achim.Zeileis at uibk.ac.at
Mon Aug 22 19:24:48 CEST 2016
On Mon, 22 Aug 2016, MIKE DE LA HOZ wrote:
>
> Hi,
>
>
> I am running a chaid tree using titanic dataset (see attachment)
>
>
>
> setwd("C:/Users/miguel")
>
> titanic <- read.csv("train.csv")
> titanic.s <- subset( titanic, select = -c(PassengerId, Name ) )
>
> ctrl <- chaid_control(minsplit = 20, minbucket = 5, minprob = 0)
> chaidTitanic <- chaid(Survived ~ ., data = titanic, control = ctrl)
>
>
>
> It looks like I get the following error
>
> Error: is.factor(x) is not TRUE
>
>
>
> Can you please help me here? I am not able to follow this type of error. If you can rewrite the sentence for me, it will be much appreciated.
To apply the chaid() function, all variables (both the response and the
predictors) need to be categorical, i.e., of class "factor" in R.
It is not clear which variables are the culprits here because your example
is not reproducible. I guess that there are at least some numeric
regressor variables. Maybe the "Survived" response is also in numeric
dummy coding rather than stored as the appropriate "factor" variable.
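As a rough sketch of how one might check and convert the variables: the
data frame below is a small made-up stand-in for train.csv (its columns and
values are assumptions, not the real data), and the age breaks are arbitrary.
Note also that the quoted call passes "titanic" rather than "titanic.s", so
PassengerId and Name would still be part of the formula.

```r
# Toy stand-in for the Titanic data; columns and values are illustrative only.
titanic <- data.frame(
  Survived = c(0, 1, 1, 0, 1, 0),
  Pclass   = c(3, 1, 3, 1, 3, 2),
  Sex      = c("male", "female", "female", "male", "female", "male"),
  Age      = c(22, 38, 26, 54, 27, 35),
  stringsAsFactors = FALSE
)

# List the columns that are not factors; any of these would make
# chaid() stop with "is.factor(x) is not TRUE".
not_factor <- names(titanic)[!sapply(titanic, is.factor)]
print(not_factor)

# Discrete codes can be converted directly ...
titanic$Survived <- factor(titanic$Survived, levels = c(0, 1),
                           labels = c("no", "yes"))
titanic$Pclass   <- factor(titanic$Pclass)
titanic$Sex      <- factor(titanic$Sex)

# ... while continuous variables have to be binned first
# (the cut points here are arbitrary).
titanic$Age <- cut(titanic$Age, breaks = c(0, 30, 60, Inf),
                   labels = c("young", "middle", "old"))

# Every column should now be a factor.
all_factor <- all(sapply(titanic, is.factor))
print(all_factor)
```

Once every column is a factor, the chaid() call from the question should at
least get past this particular check.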
In any case, I would recommend using a tree model that can deal with both
numeric and categorical regressor variables. If you want something that
selects split variables and split points based on statistical tests, ctree()
from package "partykit" would be the obvious candidate.
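A minimal sketch of what that could look like, assuming "partykit" is
installed. The data are simulated (variable names and the survival
probabilities are made up for illustration), so the fitted tree says nothing
about the real Titanic data.

```r
library(partykit)  # assumes the package is installed

set.seed(1)
n <- 200

# Simulated data mixing a categorical and two numeric regressors.
d <- data.frame(
  Sex  = factor(sample(c("female", "male"), n, replace = TRUE)),
  Age  = round(runif(n, 1, 80)),
  Fare = round(rexp(n, rate = 1 / 30), 2)
)

# Hypothetical mechanism: survival probability depends on Sex only.
p <- ifelse(d$Sex == "female", 0.75, 0.2)
d$Survived <- factor(rbinom(n, 1, p), levels = c(0, 1),
                     labels = c("no", "yes"))

# The response must still be a factor for classification, but the
# regressors may be numeric or categorical.
ct <- ctree(Survived ~ ., data = d)
print(ct)

# Predicted classes for the training data.
pred <- predict(ct, newdata = d)
```

plot(ct) then draws the tree with the distribution of the response in each
terminal node.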
>
> Thanks