[R] Isolation forest using "solitude" package: help to predict

Wed Aug 14 11:23:26 CEST 2019

Dear community,

I would like to know if someone can help clarify how to predict anomaly
scores on new data sets using the "solitude" package. A simple model can be
trained using:

library(solitude)
# Training the model:
iris_train <- iris[1:100, ]
model <- isolation_forest(iris_train[, 1:4], seed = 100, num.trees = 100,
                          importance = "none")

# The anomaly scores of a new test data set can be calculated by:
iris_test <- iris[100:150, ]
predicted_anomalies <- predict(model, iris_test[, 1:4], type = "anomaly_score")

# The challenge is how to predict the anomaly scores for a data set with
# fewer observations than the number of observations in the training data set.
# Example: using a subset of just 11 observations, as compared to the 51
# observations above, results in anomaly scores that are smaller:

iris_test <- iris[100:110, ]
predicted_anomalies <- predict(model, iris_test[, 1:4], type = "anomaly_score")

Does anyone know how to predict "normalised (with respect to sample size)"
anomaly scores using the solitude package for R?

Thanks in advance!
Johan

-- 
Johan Lassen

"In the cities people live in time -
in the mountains people live in space" (Buddhist monk).
