Graph-based Nearest Neighbor Search: From Practice to Theory
Graph-based approaches are empirically shown to be very successful for approximate nearest neighbor (ANN) search. However, there has been very little research on their theoretical guarantees. In this work, we consider both low-dimensional (d << log(n)) and high-dimensional (d >> log(n)) regimes and rigorously analyze the performance of graph-based nearest neighbor algorithms when the dataset is uniformly distributed on a d-dimensional sphere. For both regimes, we provide the conditions which guarantee that a graph-based algorithm solves the ANN problem in just one iteration. In the low-dimensional regime, we also show that it is possible to solve the exact nearest neighbor problem. Finally, we discuss how the "small-world" property affects the performance of graph-based approaches.
READ FULL TEXT