Optimal Rank and Select Queries on Dictionary-Compressed Text
Let γ be the size of a string attractor for a string S of length n over an alphabet of size σ. By known reductions, we can build a string attractor such that γ is upper-bounded by the number r of equal-letter runs in the Burrows-Wheeler transform of S, the size z of its Lempel-Ziv 77 factorization, the sizes g and b of a grammar and a Macro Scheme generating S, or the size e of the smallest compact automaton recognizing all S's suffixes. It is known that, within O (γ^ϵ n(n/γ) / n) words of space, random access on S can be performed in optimal O((n/γ)/ n) time. In this paper we show that, within O (σγ^ϵ n(n/γ)/ n) words of space, also rank and select queries can be supported in O((n/γ)/ n) time. We provide lower bounds showing that, when σ∈ O(polylog n), these space-time upper bounds are optimal. Our structures match the lower bounds also on most dictionary compression schemes. This is the first result showing that rank and select queries on the text can be supported efficiently in a space bounded by the size of a compressed Burrows-Wheeler transform or a general Macro Scheme, and improves existing bounds for LZ77-compressed text by a n time-factor in select queries.
READ FULL TEXT