[R] Isolation forest using "solitude" package: help to predict
Johan Lassen
joh@n|@@@en @end|ng |rom gm@||@com
Wed Aug 14 11:23:26 CEST 2019
Dear community,
I would like to know if someone can help clarify how to predict anomaly
scores on new data sets using the "solitude" package. A simple model can be
trained using:
library(solitude)
# Training the model:
iris_train <- iris[1:100, ]
model <- isolation_forest(iris_train[, 1:4], seed = 100, num.trees = 100,
                          importance = "none")
# The anomaly scores of a new test data set can be calculated by
iris_test <- iris[100:150, ]
predicted_anomalies <- predict(model, iris_test[, 1:4], type = "anomaly_score")
# The challenge is how to predict the anomaly scores for a data set with
# fewer observations than the number of observations in the training data
# set.
# Example: using a subset of just 11 observations instead of the 51
# observations above results in smaller anomaly scores:
iris_test <- iris[100:110, ]
predicted_anomalies <- predict(model, iris_test[, 1:4], type = "anomaly_score")
Does anyone know how to predict anomaly scores that are normalised with
respect to sample size using the solitude package for R?
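
For reference, in the original isolation forest formulation (Liu, Ting and
Zhou, 2008) the score is s(x, n) = 2^(-E[h(x)] / c(n)), where E[h(x)] is the
mean depth at which x is isolated and c(n) is the average path length of an
unsuccessful search in a binary search tree of n points. The sketch below only
illustrates how the choice of n changes the score for the same mean depth. It
does not use solitude at all: avg_path_length() and anomaly_score() are
hypothetical helpers, and whether solitude normalises by the training or the
prediction sample size is an assumption I have not verified.

```r
# Hypothetical helpers; NOT part of the solitude package.

# c(n) = 2 * H(n - 1) - 2 * (n - 1) / n for n > 2, c(2) = 1, c(1) = 0,
# with the harmonic number approximated as H(i) ~ log(i) + Euler's constant.
avg_path_length <- function(n) {
  stopifnot(is.numeric(n), all(n >= 1))
  euler_gamma <- 0.5772156649
  out <- numeric(length(n))   # initialised to 0, which covers n == 1
  out[n == 2] <- 1
  big <- n > 2
  out[big] <- 2 * (log(n[big] - 1) + euler_gamma) - 2 * (n[big] - 1) / n[big]
  out
}

# Anomaly score for a given mean isolation depth and normalising sample size n.
anomaly_score <- function(mean_depth, n) {
  2^(-mean_depth / avg_path_length(n))
}

# Same mean depth, three different normalising sample sizes:
anomaly_score(mean_depth = 5, n = c(11, 51, 100))
```

For a fixed mean depth the score increases with n, so a score normalised with
n = 11 is smaller than one normalised with n = 51 or n = 100.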
Thanks in advance!
Johan
--
Johan Lassen
"In the cities people live in time -
in the mountains people live in space" (Buddhist monk).