Achieving anonymity via weak lower bound constraints for k-median and k-means
We study k-clustering problems with lower bounds, including k-median and k-means clustering with lower bounds. In addition to the point set P and the number of centers k, a k-clustering problem with (uniform) lower bounds gets a number B. The solution space is restricted to clusterings where every cluster has at least B points. We demonstrate how to approximate k-median with lower bounds via a reduction to facility location with lower bounds, for which O(1)-approximation algorithms are known. Then we propose a new constrained clustering problem with lower bounds where we allow points to be assigned multiple times (to different centers). This means that for every point, the clustering specifies a set of centers to which it is assigned. We call this clustering with weak lower bounds. We give an 8-approximation for k-median clustering with weak lower bounds and an O(1)-approximation for k-means with weak lower bounds. We conclude by showing that at a constant increase in the approximation factor, we can restrict the number of assignments of every point to 2 (or, if we allow fractional assignments, to 1+ϵ). This also leads to the first bicritera approximation algorithm for k-means with (standard) lower bounds where bicriteria is interpreted in the sense that the lower bounds are violated by a constant factor. All algorithms in this paper run in time that is polynomial in n and k (and d for the Euclidean variants considered).
READ FULL TEXT