[R] How to deal with missing values when using Random Forest
David Winsemius
dwinsemius at comcast.net
Sun Feb 26 02:26:39 CET 2012
On Feb 25, 2012, at 6:24 PM, kevin123 wrote:
> I am using the randomForest package to train and test a model.
> I aim to predict LengthOfStay.days:
>
>> library(randomForest)
>> model <- randomForest( LengthOfStay.days~.,data = training,
> + importance=TRUE,
> + keep.forest=TRUE
> + )
>
>
> *This is a small portion of the data frame: *
>
> *data(training)*
>
> LengthOfStay.days CharlsonIndex.numeric DSFS.months
> 1 0 0.0 8.5
> 6 0 0.0 3.5
> 7 0 0.0 0.5
> 8 0 0.0 0.5
> 9 0 0.0 1.5
> 11 0 1.5 NaN
>
> *Error message*
>
> Error in na.fail.default(list(LengthOfStay.days = c(0, 0, 0, 0, 0,
> 0, :
> missing values in object,
What part of that error message is unclear? Have you looked at the
randomForest help page? It tells you that the default na.action is
na.fail, which stops with an error when the data contain missing values.
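
Before fitting, it may help to see where the missing values actually are.
The following is only a sketch: the `training` data frame here is a made-up
stand-in built from the six rows quoted above, not the poster's real data.

```r
# Hypothetical stand-in for the poster's data (six rows copied from the post)
training <- data.frame(
  LengthOfStay.days     = c(0, 0, 0, 0, 0, 0),
  CharlsonIndex.numeric = c(0, 0, 0, 0, 0, 1.5),
  DSFS.months           = c(8.5, 3.5, 0.5, 0.5, 1.5, NaN)
)

# is.na() is TRUE for NaN as well as NA, so NaN counts as missing here
na_counts <- colSums(is.na(training))
print(na_counts)

# Rows with at least one missing value; these are what trigger na.fail
incomplete <- training[!complete.cases(training), ]
print(incomplete)
```

Any column with a non-zero count is enough to make the default na.fail stop
the call.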
>
> I would greatly appreciate any help
It would seem that the way forward is to remove the cases with missing
values or to impute values.
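
Both routes can be sketched roughly as below. The data are simulated
(the column names are borrowed from the post, the values are invented), and
the choice among na.omit, na.roughfix and rfImpute is illustrative rather
than a recommendation. Note that rfImpute assumes the response itself has no
missing values.

```r
library(randomForest)

# Simulated stand-in for `training`; values are invented for illustration
set.seed(42)
n <- 60
training <- data.frame(
  LengthOfStay.days     = rpois(n, 2),
  CharlsonIndex.numeric = sample(c(0, 1.5, 3.5), n, replace = TRUE),
  DSFS.months           = sample(seq(0.5, 11.5, by = 1), n, replace = TRUE)
)
training$DSFS.months[c(6, 17, 30)] <- NaN  # inject a few missing predictors

# Option 1: remove the cases with missing values
m1 <- randomForest(LengthOfStay.days ~ ., data = training,
                   na.action = na.omit)

# Option 2: quick imputation (column median for numerics, mode for factors)
m2 <- randomForest(LengthOfStay.days ~ ., data = training,
                   na.action = na.roughfix)

# Option 3: proximity-based imputation, then fit on the completed data
imputed <- rfImpute(LengthOfStay.days ~ ., data = training)
m3 <- randomForest(LengthOfStay.days ~ ., data = imputed)
```

Dropping rows is simplest but discards data; the two imputation options keep
every case at the price of filling in values that were never observed.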
--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT