Anomaly Detection With Partitioning Overfitting Autoencoder Ensembles
In this paper, we propose POTATOES (Partitioning OverfiTing AuTOencoder EnSemble), a new type of autoencoder ensemble for unsupervised outlier detection. Autoencoders are a popular method for this type of problem, especially when the data lies near a submanifold of lower dimension than the ambient space. The standard approach is to approximate the data with the decoder submanifold of the autoencoder and to use the reconstruction error as the anomaly score. However, a main difficulty is often finding the right amount of regularization. If the regularization is too strong, the data is underfitted and we obtain many false positives. If the regularization is too weak, the data is overfitted, which results in false negatives. The remedy we propose is not to regularize at all, but rather to randomly partition the data into sufficiently many equally sized parts, overfit each part with its own autoencoder, and use the maximum over all autoencoder reconstruction errors as the anomaly score. We apply our model to realistic data and show that it outperforms current outlier detection methods.
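The partition, overfit, and maximum steps described above can be illustrated with a small sketch. It is not the authors' implementation. To stay dependency-free, it replaces each overfitted autoencoder with a hypothetical stand-in, `fit_memorizer`, which reconstructs a point as the nearest training point of its part (zero error on its own part, as an idealized overfit would give). The names `partition`, `fit_memorizer`, `sq_dist`, and `anomaly_score`, and the toy data, are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Reconstructor = Callable[[Point], Point]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance, used here as the reconstruction error."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def partition(data: Sequence[Point], n_parts: int, seed: int = 0) -> List[List[Point]]:
    """Randomly split the data into n_parts parts whose sizes differ by at most one."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_parts] for i in range(n_parts)]


def fit_memorizer(part: Sequence[Point]) -> Reconstructor:
    """Stand-in for an overfitted autoencoder (an assumption, not the paper's model).

    It memorizes its part and reconstructs any input as the nearest memorized point.
    """
    stored = [tuple(p) for p in part]
    if not stored:
        raise ValueError("each part needs at least one point")

    def reconstruct(x: Point) -> Point:
        return min(stored, key=lambda p: sq_dist(p, x))

    return reconstruct


def anomaly_score(x: Point, models: Sequence[Reconstructor]) -> float:
    """Maximum reconstruction error over all ensemble members."""
    return max(sq_dist(x, model(x)) for model in models)


# Toy data: 40 inliers on the line y = x, plus one far-away outlier.
inliers = [(i / 10, i / 10) for i in range(40)]
outlier = (0.0, 100.0)
data = inliers + [outlier]

# A single overfitted model memorizes the outlier: zero error, a false negative.
single = fit_memorizer(data)
single_error = sq_dist(outlier, single(outlier))

# The partitioned ensemble: the outlier is memorized by only one member,
# so the remaining members reconstruct it poorly and the maximum is large.
parts = partition(data, n_parts=4, seed=0)
models = [fit_memorizer(p) for p in parts]
scores = {x: anomaly_score(x, models) for x in data}
```

In this toy setting, the outlier belongs to exactly one part, so the other three members can only reconstruct it from points on the line. Inliers, by contrast, have nearby points in every part. Taking the maximum therefore separates the two, whereas the single overfitted model assigns the outlier an error of zero. How many parts count as "sufficiently many" is not specified in the abstract; the value 4 here is arbitrary.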