Strong Consistency for a Class of Adaptive Clustering Procedures
We introduce a class of clustering procedures which includes k-means and k-medians, as well as variants of these where the domain of the cluster centers can be chosen adaptively (for example, k-medoids) and where the number of cluster centers can be chosen adaptively (for example, according to the elbow method). In the non-parametric setting and assuming only the finiteness of certain moments, we show that all clustering procedures in this class are strongly consistent under IID samples. Our method of proof is to directly study the continuity of various deterministic maps associated with these clustering procedures, and to show that strong consistency simply descends from analogous strong consistency of the empirical measures. In the adaptive setting, our work provides a strong consistency result that is the first of its kind. In the non-adaptive setting, our work strengthens Pollard's classical result by dispensing with various unnecessary technical hypotheses, by upgrading the particular notion of strong consistency, and by using the same methods to prove further limit theorems.
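As a rough numerical illustration of what strong consistency means in the simplest non-adaptive case, the sketch below computes exact 2-means centers for IID samples of increasing size. Everything specific here is an assumption chosen for illustration and is not taken from the paper: the uniform distribution on [0, 1], the choice k = 2, and the hypothetical helper names `two_means_1d` and `empirical_centers`. For this distribution the population 2-means centers are 1/4 and 3/4, so the empirical centers should drift toward those values as the sample grows. The sketch does not cover the adaptive variants (k-medoids, elbow method), nor does it reflect the paper's method of proof.

```python
import random


def two_means_1d(xs):
    """Exact 2-means on the real line (requires at least two points).

    In one dimension the optimal clusters are contiguous in sorted order,
    so it suffices to try every split point of the sorted sample.
    """
    xs = sorted(xs)
    n = len(xs)
    # Prefix sums of x and x^2 give each split's cost in O(1).
    prefix, prefix_sq = [0.0], [0.0]
    for x in xs:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)

    best = None
    for i in range(1, n):
        left_sum = prefix[i]
        right_sum = prefix[n] - prefix[i]
        # Within-cluster sum of squares: sum(x^2) - (sum x)^2 / size, per side.
        cost = prefix_sq[n] - left_sum ** 2 / i - right_sum ** 2 / (n - i)
        if best is None or cost < best[0]:
            best = (cost, left_sum / i, right_sum / (n - i))
    return best[1], best[2]


def empirical_centers(n, seed=0):
    """2-means centers of n IID Uniform[0, 1] draws (illustrative assumption)."""
    rng = random.Random(seed)
    return two_means_1d([rng.random() for _ in range(n)])


if __name__ == "__main__":
    # Centers should approach the population values 0.25 and 0.75.
    for n in (100, 1_000, 10_000):
        print(n, empirical_centers(n))
```

A single seeded run only suggests convergence along one sample path; strong consistency is the statement that such convergence holds almost surely.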