Grammar-compressed Self-index with Lyndon Words
We introduce a new class of straight-line programs (SLPs), named the Lyndon SLP, inspired by the Lyndon trees (Barcelo, 1990). Based on this SLP, we propose a self-index data structure that occupies O(g) words of space and can be built from a string T in O(n + g lg g) time. The index retrieves the starting positions of all occurrences of a pattern P of length m in O(m + lg m lg n + occ lg g) time, where n is the length of T, g is the size of the Lyndon SLP for T, and occ is the number of occurrences of P in T.
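To make the idea of a grammar derived from a Lyndon tree concrete, the following is a minimal sketch, not the construction used in the paper. It assumes the input is itself a Lyndon word and repeatedly applies the standard factorization w = uv, where v is the longest proper suffix of w that is a Lyndon word. Identical substrings share one rule, so the number of rules gives a rough feel for the grammar size. The names `is_lyndon` and `lyndon_slp`, and the choice to use each derived string as its own variable name, are illustrative assumptions; the code runs in polynomial rather than linear time.

```python
def is_lyndon(w: str) -> bool:
    """Return True if w is strictly smaller than every proper suffix of w."""
    return all(w < w[i:] for i in range(1, len(w)))


def lyndon_slp(w: str, rules: dict) -> str:
    """Add rules deriving the Lyndon word w to `rules`; return its variable.

    Each variable is named by the string it derives (an illustrative choice).
    A terminal rule maps a character to itself; a binary rule maps a string
    to the pair (left variable, right variable).
    """
    if w in rules:               # reuse the rule for a repeated substring
        return w
    if len(w) == 1:
        rules[w] = w             # terminal rule X -> a
        return w
    # Standard factorization: the first Lyndon suffix found when scanning
    # start positions from the left is the longest proper Lyndon suffix.
    split = next(i for i in range(1, len(w)) if is_lyndon(w[i:]))
    left = lyndon_slp(w[:split], rules)
    right = lyndon_slp(w[split:], rules)
    rules[w] = (left, right)     # binary rule X -> Y Z
    return w


rules = {}
root = lyndon_slp("aababb", rules)
# rules["aababb"] is ("a", "ababb") and rules["ababb"] is ("ab", "abb");
# the variable "ab" is shared by "ababb" and "abb", giving 6 rules in total.
```

Every variable in this sketch derives a Lyndon word, because both factors of the standard factorization of a Lyndon word are again Lyndon words.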