[R] problem with formula argument to randomForest
Ed Komp
komp at ittc.ku.edu
Wed Oct 28 15:25:49 CET 2015
The randomForest function raises an error whenever
I supply it with a formula that uses the I() function to inhibit interpretation.
I always get an error like this one:
Error in unique(c("AsIs", oldClass(x))) : object 'Age' not found
Is this because of:
1. a restriction for the randomForest function that I have not seen documented;
2. a deficiency / error in randomForest; or
3. an error in my calling sequence?
I am including a very simple example to demonstrate the problem.
Simply using I(<colname>) triggers the error.
This is not a meaningful use of I(), but it keeps the example simple.
My actual interest is in I(<col1> / <col2>).
I also demonstrate that the usage of I() in a formula works just fine
for another discrimination function, lda.
The sample code is included after my signature, along with line-by-line output.
Thanks in advance!
Ed Komp
ITTC Lab, University of Kansas
===============
> library(rpart)
> library(MASS)
> library(randomForest)
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.
> formula <- as.formula('Kyphosis ~ Age + Number + Start')
> formula
Kyphosis ~ Age + Number + Start
> formulaWithI <- as.formula('Kyphosis ~ I(Age) + Number + Start')
> formulaWithI
Kyphosis ~ I(Age) + Number + Start
> fit <- randomForest(formula, data=kyphosis)
> fitWithI <- randomForest(formulaWithI, data=kyphosis)
Error in unique(c("AsIs", oldClass(x))) : object 'Age' not found
>
> fit <- lda(formula, data = kyphosis)
> fitWithI <- lda(formula, data = kyphosis)
> fitWithI
Call:
lda(formula, data = kyphosis)

Prior probabilities of groups:
   absent   present
0.7901235 0.2098765

Group means:
             Age   Number     Start
absent  79.89062 3.750000 12.609375
present 97.82353 5.176471  7.294118

Coefficients of linear discriminants:
                LD1
Age     0.005910971
Number  0.291501797
Start  -0.170496626
>
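Note that the second lda() call above is given `formula`, not `formulaWithI`, so its output does not exercise I(). A minimal sketch of the intended comparison follows, assuming the MASS and rpart packages are installed; the formulas are written inline here for illustration only.

```r
library(MASS)   # lda()
library(rpart)  # kyphosis data set

# Fit once without and once with I() around Age.
fitPlain <- lda(Kyphosis ~ Age + Number + Start, data = kyphosis)
fitWithI <- lda(Kyphosis ~ I(Age) + Number + Start, data = kyphosis)

# Print the second fit; its term labels show how I(Age) was handled.
print(fitWithI)
```

If the second call completes, lda() accepts I() in a formula, which is the behaviour being contrasted with randomForest().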
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11 (El Capitan)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] randomForest_4.6-12 MASS_7.3-44 rpart_4.1-10
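A possible workaround for the I(<col1> / <col2>) case, sketched on the assumption that the error is tied to I() appearing in the formula: compute the ratio as an ordinary column first, so that the formula passed to randomForest() contains no I() call. The names `kyph2` and `AgePerNumber` are hypothetical and chosen only for illustration.

```r
library(rpart)         # kyphosis data set
library(randomForest)

set.seed(1)  # random forests are stochastic; fix the seed for repeatability

# Precompute the derived predictor as a plain column
# (hypothetical name: AgePerNumber).
kyph2 <- transform(kyphosis, AgePerNumber = Age / Number)

# The formula now refers only to existing columns, so no I() is needed.
fit <- randomForest(Kyphosis ~ AgePerNumber + Start, data = kyph2)
print(fit)
```

This sidesteps the question of whether randomForest() should support I(); it only avoids the failing call.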