Random projections and sampling algorithms for clustering of high-dimensional polygonal curves
We study the center and median clustering problems for high-dimensional polygonal curves with finite but unbounded complexity. We tackle the computational issue that arises from the high number of dimensions by defining a Johnson-Lindenstrauss projection for polygonal curves. We analyze the resulting error in terms of the Fréchet distance, which is a natural dissimilarity measure for curves. Our algorithms for the median clustering achieve sublinear dependency on the number of input curves via subsampling. For the center clustering we utilize Buchin et al. (2019a) algorithm that achieves linear running-time in the number of input curves. We evaluate our results empirically utilizing a fast, CUDA-parallelized variant of the Alt and Godau algorithm for the Fréchet distance. Our experiments show that our clustering algorithms have fast and accurate practical implementations that yield meaningful results on real world data from various physical domains.
READ FULL TEXT