[R] Discretization of numeric attributes
Hans W. Borchers
borchers at decrc.abb.de
Tue May 7 10:20:25 CEST 2002
Thanks for the time you are taking for this.
>Think about what the top level split in a tree does. You could also
>extract the C routines used.
That's what I didn't want to do. Maybe it's worth writing and implementing
these routines myself in S.
>You are misusing the terms though: C4.5 is not a splitting rule but a
>tree-construction and pruning algorithm, and MDL is a principle to choose
>complexity.
If you have a look at the well-known overview article "Supervised and
unsupervised discretization of continuous features" by Dougherty, Kohavi
and Sahami, you will see that people have taken the approach used in C4.5,
extracted its discretization procedure, and evaluated it. That's what I
meant by "C45".
The MDL approach to discretization as used by Fayyad and Irani or
Kononenko (see the RELIEFF algorithm) is implemented in the freely
available WEKA data mining toolkit. Something similar is what I need.
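
To make concrete what kind of routine is meant, here is a minimal sketch of
entropy-based discretization with an MDL stopping rule in the style of
Fayyad and Irani. It is written in Python rather than S, it is not the WEKA
code, and the function names (entropy, mdl_cut_points, _split) are made up
for illustration. Each step picks the cut that a top-level tree split would
pick (maximum information gain) and keeps it only if the gain exceeds the
MDL threshold; it then recurses on both sides.

```python
import math
from collections import Counter


def entropy(labels):
    """Class entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def _split(xs, ys):
    """Recursively find accepted cut points; xs must be sorted, ys aligned."""
    n = len(xs)
    base = entropy(ys)
    best = None  # (gain, index) of the best candidate cut so far
    for i in range(1, n):
        if xs[i - 1] == xs[i]:
            continue  # a cut is only possible between distinct values
        gain = (base
                - (i / n) * entropy(ys[:i])
                - ((n - i) / n) * entropy(ys[i:]))
        if best is None or gain > best[0]:
            best = (gain, i)
    if best is None:
        return []
    gain, i = best
    left, right = ys[:i], ys[i:]
    # MDL acceptance test: number of classes in the whole set and each part.
    k, k1, k2 = len(set(ys)), len(set(left)), len(set(right))
    delta = math.log2(3 ** k - 2) - (
        k * base - k1 * entropy(left) - k2 * entropy(right))
    if gain <= (math.log2(n - 1) + delta) / n:
        return []  # the cut does not pay for its description length
    cut = (xs[i - 1] + xs[i]) / 2  # midpoint between neighbouring values
    return _split(xs[:i], left) + [cut] + _split(xs[i:], right)


def mdl_cut_points(values, labels):
    """Return the sorted list of cut points for one numeric attribute."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    return _split(xs, ys)
```

For example, mdl_cut_points(range(1, 11), ["a"] * 5 + ["b"] * 5) yields a
single cut at 5.5, while strictly alternating class labels yield no cut at
all, so the attribute is left as one interval.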
Very best, Hans Werner Borchers.