Substring Complexities on Run-length Compressed Strings

05/25/2022
by   Akiyoshi Kawamoto, et al.
0

Let S_T(k) denote the set of distinct substrings of length k in a string T, then the k-th substring complexity is defined by its cardinality |S_T(k)|. Recently, δ = max{ |S_T(k)| / k : k ≥ 1 } is shown to be a good compressibility measure of highly-repetitive strings. In this paper, given T of length n in the run-length compressed form of size r, we show that δ can be computed in 𝐶_𝗌𝗈𝗋𝗍(r, n) time and O(r) space, where 𝐶_𝗌𝗈𝗋𝗍(r, n) = O(min (r r, r _r n)) is the time complexity for sorting r O( n)-bit integers in O(r) space in the Word-RAM model with word size Ω( n).

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset