[R] partykit ctree: minbucket and case weights
Amber Dawn Nolder
a.d.nolder at iup.edu
Wed May 28 23:16:12 CEST 2014
Hello,
I am an R novice, and I am using the "partykit" package to create
regression trees. I used the following to generate the trees:
ctree(y ~ x1 + x2 + x3 + x4, data = my_data,
      control = ctree_control(testtype = "Bonferroni", mincriterion = 0.90,
                              minsplit = 12, minbucket = 4, majority = TRUE))
I thought that "minbucket" set the minimum value for the sum of weights
in each terminal node, and that each case weight is 1 unless otherwise
specified, so that the sum of case weights in a node should equal the
number of cases (n) in that node. However, I sometimes obtain a tree with
a terminal node that contains fewer than 4 cases.
My data set has a total of 36 cases. The dependent variable and all
independent variables are continuous. Variables x1 and x2 contain missing
(NA) values.
Could someone please explain why I am getting these results?
Am I mistaken about the value of case weights or about the use of minbucket
to restrict the size of a terminal node?
This is an example of the output:
Model formula:
y ~ x1 + x2 + x3 + x4
Fitted party:
[1] root
| [2] x4 <= 30: 0.927 (n = 17, err = 1.1)
| [3] x4 > 30
| | [4] x2 <= 43: 0.472 (n = 8, err = 0.4)
| | [5] x2 > 43
| | | [6] x3 <= 0.4: 0.282 (n = 3, err = 0.0)
| | | [7] x3 > 0.4: 0.020 (n = 8, err = 0.0)
Number of inner nodes: 3
Number of terminal nodes: 4
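For anyone who wants to check node sizes like this on their own data, the following is a minimal sketch of how the weight sum and the number of missing values per terminal node might be tabulated. It assumes a synthetic stand-in for my_data (the values below are made up and are not my real data), so it will not necessarily reproduce the small node shown above, and the explicit unit weights w are only an illustration of the default.

```r
library(partykit)

# Hypothetical stand-in for my_data: 36 cases, continuous variables,
# with NAs placed in x1 and x2 (all values are invented for illustration).
set.seed(1)
my_data <- data.frame(
  x1 = runif(36),
  x2 = runif(36, 0, 100),
  x3 = runif(36),
  x4 = runif(36, 0, 60)
)
my_data$y <- ifelse(my_data$x4 <= 30, 0.9, 0.3) + rnorm(36, sd = 0.1)
my_data$x1[c(3, 17)] <- NA
my_data$x2[c(5, 11, 29)] <- NA

# Explicit unit case weights (assumed to match the default behaviour).
w <- rep(1L, nrow(my_data))

tree <- ctree(y ~ x1 + x2 + x3 + x4, data = my_data, weights = w,
              control = ctree_control(testtype = "Bonferroni",
                                      mincriterion = 0.90, minsplit = 12,
                                      minbucket = 4, majority = TRUE))

# Terminal node id assigned to each training case.
node <- predict(tree, type = "node")

# Sum of case weights per terminal node; with unit weights this is n.
w_per_node <- tapply(w, node, sum)
print(w_per_node)

# Number of cases with a missing x1 or x2 in each terminal node.
na_per_node <- tapply(is.na(my_data$x1) | is.na(my_data$x2), node, sum)
print(na_per_node)
```

Comparing w_per_node with minbucket, alongside na_per_node, may help show whether the small nodes are the ones that involve cases with missing values.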
Many thanks!
Amber Nolder
Graduate Student
Indiana University of Pennsylvania