Online List Labeling: Breaking the log^2 n Barrier
The online list labeling problem is an algorithmic primitive with a large literature of upper bounds, lower bounds, and applications. The goal is to store a dynamically-changing set of n items in an array of m slots, while maintaining the invariant that the items appear in sorted order, and while minimizing the relabeling cost, defined to be the number of items that are moved per insertion/deletion. For the linear regime, where m = (1 + Θ(1)) n, an upper bound of O(log^2 n) on the relabeling cost has been known since 1981. A lower bound of Ω(log^2 n) is known for deterministic algorithms and for so-called smooth algorithms, but the best general lower bound remains Ω(log n). The central open question in the field is whether O(log^2 n) is optimal for all algorithms. In this paper, we give a randomized data structure that achieves an expected relabeling cost of O(log^{3/2} n) per operation. More generally, if m = (1 + ε) n for ε = O(1), the expected relabeling cost becomes O(ε^{-1} log^{3/2} n). Our solution is history independent, meaning that the state of the data structure is independent of the order in which items are inserted/deleted. For history-independent data structures, we also prove a matching lower bound: for all ε between 1/n^{1/3} and some sufficiently small positive constant, the optimal expected cost for history-independent list-labeling solutions is Θ(ε^{-1} log^{3/2} n).
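To make the problem statement concrete, the following is a minimal Python sketch of the setting only: items kept in sorted order in an array of m slots, with the relabeling cost counted as the number of existing items moved per insertion. It is not the randomized data structure from the paper. The class name `NaiveListLabeling` and its "shift toward the nearest free slot" strategy are assumptions introduced purely for illustration, and this naive strategy can move Θ(n) items in the worst case.

```python
from typing import List, Optional


class NaiveListLabeling:
    """Hypothetical baseline: sorted items in m slots, gaps stored as None."""

    def __init__(self, m: int) -> None:
        self.slots: List[Optional[int]] = [None] * m
        self.total_moves = 0  # relabeling cost summed over all insertions

    def insert(self, x: int) -> int:
        """Insert x in sorted position; return how many existing items moved."""
        slots, m = self.slots, len(self.slots)
        # p: index of the first occupied slot holding an item > x (m if none).
        p = next((i for i, v in enumerate(slots) if v is not None and v > x), m)
        # q: index of the last occupied slot before p (-1 if none).
        q = p - 1
        while q >= 0 and slots[q] is None:
            q -= 1
        # A free slot exists strictly between q and p: place x there, no moves.
        if p - q > 1:
            slots[(q + p) // 2] = x
            return 0
        # Otherwise find the nearest free slot on each side of the insertion point.
        right = next((r for r in range(p, m) if slots[r] is None), None)
        left = next((l for l in range(q, -1, -1) if slots[l] is None), None)
        if right is None and left is None:
            raise OverflowError("all m slots are occupied")
        if left is None or (right is not None and right - p <= q - left):
            # Shift the run slots[p:right] one step right, then place x at p.
            slots[p + 1:right + 1] = slots[p:right]
            slots[p] = x
            moved = right - p
        else:
            # Shift the run slots[left+1:q+1] one step left, then place x at q.
            slots[left:q] = slots[left + 1:q + 1]
            slots[q] = x
            moved = q - left
        self.total_moves += moved
        return moved
```

For example, with m = 4, inserting 10 and 20 costs nothing because free slots remain between neighbors, while inserting 15 afterwards forces one existing item to move. The bounds discussed in the abstract concern how small this per-operation count can be kept when m = (1 + ε) n.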