[R] rpart: Writing values of the leaves to a dataset
Terry Therneau
therneau at mayo.edu
Mon Apr 12 15:50:38 CEST 2010
-- begin inclusion --
I'm fitting a regression tree with rpart and I want to write the values
for every leaf in a dataset. As an example take the variable turnover.
Let's suppose my tree for turnover has 30 leaves and I want to have 30
datasets with dataset 1 containing the turnover values of the units in
leaf 1, dataset 2 containing turnover values for the observations in
leaf 2, and so on. How can I do this?
-- end inclusion --
fit <- rpart(y ~ .......,data=mydata)
parts <- tapply(mydata$y, predict(fit), c)
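Then parts will be a list with one element per leaf of the tree, each
containing the values of y found in that leaf. This works because
predict(fit) returns the fitted value of the leaf that each row falls
into, so it can serve as the grouping variable.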
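As a minimal sketch of this on a concrete case, assume the built-in
mtcars data with mpg as the response. Both are chosen here only for
illustration and are not from the original question.

```r
library(rpart)

# Illustrative data only: mtcars ships with R; mpg stands in for "turnover".
fit <- rpart(mpg ~ ., data = mtcars)

# predict(fit) gives the fitted leaf value for every row, so grouping on it
# groups the rows by leaf (assuming no two leaves share the same fitted value).
parts <- tapply(mtcars$mpg, predict(fit), c)

# One list element per leaf; the element sizes add up to the number of rows.
lengths(parts)
```

Each element of parts can then be written out or analysed separately.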
An alternative is
indices <- tapply(1:nrow(mydata), predict(fit), c)
which will give a list containing the row numbers of mydata that fall in
each leaf.
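If the goal is one dataset per leaf rather than a list of vectors, a
possible sketch uses the where component of the fitted rpart object,
which records the leaf each observation fell into. The mtcars data and
the names leaf, indices and leaf_data below are assumptions for
illustration, and the sketch assumes no rows were dropped for missing
values.

```r
library(rpart)

# Illustrative data only: mtcars has no missing values, so every row is used.
fit <- rpart(mpg ~ ., data = mtcars)

# For each row used in the fit, fit$where is the row number of fit$frame
# that describes the leaf the observation ended up in.
leaf <- fit$where

# Row numbers per leaf, and the matching sub-data-frames (one per leaf).
indices   <- split(seq_len(nrow(mtcars)), leaf)
leaf_data <- split(mtcars, leaf)
```

Grouping on fit$where rather than on predict(fit) avoids merging two
leaves that happen to have the same fitted value.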
Terry T.