Clustering through Feature Space Sequence Discovery and Analysis

12/02/2022
by   Shi Guobin, et al.
0

Identifying high-dimensional data patterns without a priori knowledge is an important task of data science. This paper proposes a simple and efficient noparametric algorithm: Data Convert to Sequence Analysis, DCSA, which dynamically explore each point in the feature space without repetition, and a Directed Hamilton Path will be found. Based on the change point analysis theory, The sequence corresponding to the path is cut into several fragments to achieve clustering. The experiments on real-world datasets from different fields with dimensions ranging from 4 to 20531 confirm that the method in this work is robust and has visual interpretability in result analysis.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset