Approximate Nearest Neighbor Searching with Non-Euclidean and Weighted Distances
We present a new approach to approximate nearest-neighbor queries in fixed dimension under a variety of non-Euclidean distances. We are given a set S of n points in ℝ^d, an approximation parameter ε > 0, and a distance function that satisfies certain smoothness and growth-rate assumptions. The objective is to preprocess S into a data structure so that for any query point q in ℝ^d, it is possible to efficiently report a point of S whose distance from q is within a factor of 1+ε of the distance from q to its true nearest neighbor. Prior to this work, the most efficient data structures for approximate nearest-neighbor searching in spaces of constant dimension applied only to the Euclidean metric. This paper overcomes this limitation through a method called convexification. For admissible distance functions, the proposed data structures answer queries in logarithmic time using O(n log(1/ε) / ε^{d/2}) space, nearly matching the best known bounds for the Euclidean metric. These results apply both to convex scaling distance functions (including the Mahalanobis distance and weighted Minkowski metrics) and to Bregman divergences (including the Kullback-Leibler divergence and the Itakura-Saito distance).
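To make the query guarantee and the distance functions concrete, the following is a minimal Python sketch. All names here are illustrative, and the brute-force scan serves only as a correctness reference; it is not the paper's data structure, which answers queries in logarithmic time.

    import numpy as np

    # Mahalanobis distance: a convex scaling distance induced by a
    # positive-definite matrix M (M = identity recovers the Euclidean metric).
    def mahalanobis(p, q, M):
        diff = p - q
        return np.sqrt(diff @ M @ diff)

    # Generalized Kullback-Leibler divergence, a Bregman divergence defined
    # for strictly positive vectors; it is asymmetric, so argument order matters.
    def kl_divergence(x, y):
        return np.sum(x * np.log(x / y) - x + y)

    # Brute-force reference: the exact nearest neighbor of q in S under dist.
    def nearest(S, q, dist):
        return min(S, key=lambda s: dist(s, q))

    # The (1+eps)-approximation guarantee: the reported point's distance to q
    # is at most (1+eps) times the distance to the true nearest neighbor.
    def is_valid_ann(answer, S, q, dist, eps):
        best = dist(nearest(S, q, dist), q)
        return dist(answer, q) <= (1.0 + eps) * best

    # Example usage (strictly positive coordinates so KL is well defined):
    rng = np.random.default_rng(0)
    S = rng.random((1000, 3)) + 0.1
    q = rng.random(3) + 0.1
    p = nearest(S, q, kl_divergence)
    assert is_valid_ann(p, S, q, kl_divergence, eps=0.1)

The point of the paper is to replace the O(n) scan in nearest() with a preprocessed structure of size O(n log(1/ε) / ε^{d/2}) that answers such queries in logarithmic time for any admissible distance function.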