Extended Isolation Forest

11/06/2018
by Sahand Hariri, et al.

We present an extension to the model-free anomaly detection algorithm Isolation Forest. This extension, named Extended Isolation Forest (EIF), improves the consistency and reliability of the anomaly score produced for a given data point. Using score maps, we show that the standard Isolation Forest produces inconsistent scores. The score maps suffer from an artifact that results from how the branching criterion of the binary tree is selected. We propose two approaches for improving the situation. The first is to transform the data randomly before the creation of each tree, which averages out the bias introduced by the algorithm. The second, which is the preferred approach, is to allow the slicing of the data to use hyperplanes with random slopes. This approach results in improved score maps. By looking at the variance of the scores of data points distributed along constant score lines, we show that the consistency and reliability of the algorithm are much improved with this method. We find no appreciable difference in the rate of convergence or in computation time between the standard Isolation Forest and EIF, which highlights its potential as an anomaly detection algorithm.
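
The difference between the two branching rules can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `axis_parallel_split` and `random_slope_split` are hypothetical, and the abstract does not say how the hyperplane is drawn, so the Gaussian normal vector and the uniformly drawn intercept point used here are assumptions made for illustration.

```python
import random


def axis_parallel_split(points, rng):
    """Standard-style branching: one random feature, one random threshold."""
    dim = len(points[0])
    q = rng.randrange(dim)  # feature chosen at random
    lo = min(p[q] for p in points)
    hi = max(p[q] for p in points)
    threshold = rng.uniform(lo, hi)  # cut lies inside the data range
    left = [p for p in points if p[q] < threshold]
    right = [p for p in points if p[q] >= threshold]
    return left, right


def random_slope_split(points, rng):
    """Extended-style branching: a hyperplane with a random slope.

    Assumption (not stated in the abstract): the normal vector has
    standard Gaussian components and the hyperplane passes through a
    point drawn uniformly from the bounding box of the data.
    """
    dim = len(points[0])
    normal = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    intercept = [
        rng.uniform(min(p[i] for p in points), max(p[i] for p in points))
        for i in range(dim)
    ]

    def signed_side(p):
        # Dot product of (p - intercept) with the normal vector.
        return sum(n * (x - c) for n, x, c in zip(normal, p, intercept))

    left = [p for p in points if signed_side(p) <= 0]
    right = [p for p in points if signed_side(p) > 0]
    return left, right


if __name__ == "__main__":
    rng = random.Random(0)
    data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
    for split in (axis_parallel_split, random_slope_split):
        left, right = split(data, rng)
        print(split.__name__, len(left), len(right))
```

Both functions partition the points at a single tree node; a full tree would apply the chosen rule recursively until points are isolated or a depth limit is reached. In the first function every cut is parallel to a coordinate axis, which is the source of the score-map artifact the abstract describes, while the second allows cuts in any direction.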
