String Attractors and Infinite Words
The notion of string attractor has been introduced in [Kempa and Prezza, 2018] in the context of Data Compression and it represents a set of positions of a finite word in which all of its factors can be "attracted". The smallest size γ^* of a string attractor for a finite word is a lower bound for several repetitiveness measures associated with the most common compression schemes, including BWT-based and LZ-based compressors. The combinatorial properties of the measure γ^* have been studied in [Mantaci et al., 2021]. Very recently, a complexity measure, called string attractor profile function, has been introduced for infinite words, by evaluating γ^* on each prefix. Such a measure has been studied for automatic sequences and linearly recurrent infinite words [Schaeffer and Shallit, 2021]. In this paper, we study the relationship between such a complexity measure and other well-known combinatorial notions related to repetitiveness in the context of infinite words, such as the factor complexity and the recurrence. Furthermore, we introduce new string attractor-based complexity measures, in which the structure and the distribution of positions in a string attractor of the prefixes of infinite words are considered. We show that such measures provide a finer classification of some infinite families of words.
READ FULL TEXT