Quantile-based clustering
A new cluster analysis method, K-quantiles clustering, is introduced. K-quantiles clustering can be computed by a simple greedy algorithm in the style of the classical Lloyd's algorithm for K-means. It can be applied to large and high-dimensional datasets. It allows for within-cluster skewness and internal variable scaling based on within-cluster variation. Different versions allow for different levels of parsimony and computational efficiency. Although K-quantiles clustering is conceived as nonparametric, it can be connected to a fixed partition model of generalized asymmetric Laplace-distributions. The consistency of K-quantiles clustering is proved, and it is shown that K-quantiles clusters correspond to well separated mixture components in a nonparametric mixture. In a simulation, K-quantiles clustering is compared with a number of popular clustering methods with good results. A high-dimensional microarray dataset is clustered by K-quantiles.
READ FULL TEXT