Generalized Dirichlet-process-means for f-separable distortion measures

01/31/2019
by   Masahiro Kobayashi, et al.
0

DP-means clustering was obtained as an extension of K-means clustering. While it is implemented with a simple and efficient algorithm, it can estimate the number of clusters simultaneously. However, DP-means is specifically designed for the average distortion measure. Therefore, it is vulnerable to outliers in data, and it can cause large maximum distortion in clusters. In this work, we extend the objective function of the DP-means to f-separable distortion measures and propose a unified learning algorithm to overcome the above problems by the selection of the function f. Furthermore, the influence function of the estimated cluster center is analyzed to evaluate the robustness against outliers. We show the effectiveness of the generalized method by numerical experiments using real datasets.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset