[R] randomForest and missing data

Bálint Czúcz czucz at botanika.hu
Tue Jan 9 22:30:48 CET 2007


An improved version of the original random forest algorithm is
available in the "party" package (function cforest). More details
can be found here:
http://www.stat.uni-muenchen.de/sfb386/papers/dsp/paper490.pdf

I do not know whether it solves your problem with missing data,
but it may be worth a look.
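
As a point of comparison, here is a minimal sketch of the rpart
surrogate-split behaviour that the question quoted below refers to. It
uses the built-in iris data with some predictor values blanked out at
random, which is purely an illustrative assumption and not data from
this thread; the control settings are just one reasonable choice.

```r
library(rpart)

set.seed(1)
d <- iris
# Illustrative assumption: blank out 15 values in each of two predictors
d$Petal.Length[sample(nrow(d), 15)] <- NA
d$Sepal.Width[sample(nrow(d), 15)] <- NA

# rpart keeps rows with missing predictors and stores surrogate splits
fit <- rpart(Species ~ ., data = d, method = "class",
             control = rpart.control(maxsurrogate = 5, usesurrogate = 2))

# Rows with NAs are routed by surrogates (or the majority direction)
pred <- predict(fit, newdata = d, type = "class")
table(predicted = pred, observed = d$Species)
```

By contrast, randomForest() with its default na.action stops on such a
data frame unless the values are filled in first, for example with
rfImpute or na.action = na.roughfix.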

Best regards,

Bálint

On 1/4/07, Darin A. England <england at cs.umn.edu> wrote:
>
> Does anyone know a reason why, in principle, a call to randomForest
> cannot accept a data frame with missing predictor values? If each
> individual tree is built using CART, then it seems like this
> should be possible. (I understand that one may impute missing values
> using rfImpute or some other method, but I would like to avoid doing
> that.)
>
> If this functionality were available, then when the trees are being
> constructed and when subsequent data are put through the forest, one
> would also specify an argument for the use of surrogate rules, just
> like in rpart.
>
> I realize this question is very specific to randomForest, as opposed
> to R in general, but any comments are appreciated. I suppose I am
> looking for someone to say "It's not appropriate, and here's why
> ..." or "Good idea. Please implement and post your code."
>
> Thanks,
>
> Darin England, Senior Scientist
> Ingenix