Discovering Subdimensional Motifs of Different Lengths in Large-Scale Multivariate Time Series
Detecting repeating patterns of different lengths in time series, also called variable-length motifs, has received a great deal of attention from researchers and practitioners. Despite significant progress in recent work on single-dimensional variable-length motif discovery, detecting variable-length subdimensional motifs (patterns that occur simultaneously in only a subset of the dimensions of a multivariate time series) remains a difficult task. The main challenge is scalability. On the one hand, the brute-force enumeration solution, which searches for motifs of all possible lengths, is very time-consuming even for single-dimensional time series. On the other hand, previous work shows that index-based fixed-length approximate motif discovery algorithms such as random projection are not suitable for detecting variable-length motifs because of their memory requirements. In this paper, we introduce an approximate variable-length subdimensional motif discovery algorithm called Collaborative HIerarchy based Motif Enumeration (CHIME), which efficiently detects variable-length subdimensional motifs in large-scale multivariate time series given a minimum motif length. We show that the memory cost of the approach is significantly smaller than that of random projection. Moreover, the proposed algorithm is significantly faster than state-of-the-art algorithms. We demonstrate that CHIME can efficiently detect meaningful variable-length subdimensional motifs in large real-world multivariate time series datasets.
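To make the scalability problem concrete, the brute-force baseline mentioned above can be sketched for a single-dimensional series as follows. This is an illustrative sketch only, not CHIME and not code from the paper: the function names `znorm` and `brute_force_motifs`, the use of z-normalized Euclidean distance, and the explicit `max_len` bound are all assumptions made for the example.

```python
import math


def znorm(x):
    """Z-normalize a subsequence (assumed preprocessing, not taken from the paper)."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std < 1e-12:
        # Constant subsequence: map to all zeros to avoid division by zero.
        return [0.0] * n
    return [(v - mean) / std for v in x]


def brute_force_motifs(series, min_len, max_len):
    """Return {length: ((i, j), distance)} for the closest non-overlapping
    pair of subsequences at every length in [min_len, max_len]."""
    best = {}
    n = len(series)
    for length in range(min_len, max_len + 1):
        # Every subsequence of this length is materialized and normalized.
        subs = [znorm(series[i:i + length]) for i in range(n - length + 1)]
        best_pair, best_dist = None, math.inf
        for i in range(len(subs)):
            # Start j at i + length so the two subsequences never overlap.
            for j in range(i + length, len(subs)):
                d = math.sqrt(sum((a - b) ** 2 for a, b in zip(subs[i], subs[j])))
                if d < best_dist:
                    best_pair, best_dist = (i, j), d
        if best_pair is not None:
            best[length] = (best_pair, best_dist)
    return best
```

Each candidate length costs roughly quadratic time in the series length, multiplied by the subsequence length, and the whole search is repeated for every length. That repetition is why enumeration becomes impractical on long series, even before the search is extended to subsets of dimensions.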