How Far Should We Look Back to Achieve Effective Real-Time Time-Series Anomaly Detection?
Anomaly detection is the process of identifying unexpected events or abnormalities in data, and it has been applied in many areas, such as system monitoring, fraud detection, healthcare, and intrusion detection. Providing real-time, lightweight, and proactive anomaly detection for time series without human intervention or domain knowledge could be highly valuable, since it reduces human effort and enables appropriate countermeasures to be taken before a disastrous event occurs. To our knowledge, RePAD (Real-time Proactive Anomaly Detection algorithm) is a generic approach with all of the above-mentioned features. To achieve real-time and lightweight detection, RePAD utilizes Long Short-Term Memory (LSTM) to detect whether or not each upcoming data point is anomalous based on short-term historical data points. However, it is unclear how different amounts of historical data points affect the performance of RePAD. Therefore, in this paper, we investigate the impact of different amounts of historical data on RePAD by introducing a set of performance metrics that cover novel detection accuracy measures, time efficiency, readiness, and resource consumption. Empirical experiments based on real-world time series datasets are conducted to evaluate RePAD in different scenarios, and the experimental results are presented and discussed.