HOLZ: High-Order Entropy Encoding of Lempel-Ziv Factor Distances
We propose a new representation of the offsets of the Lempel-Ziv (LZ) factorization based on the co-lexicographic order of the processed prefixes. The encoded size of the offsets selected in this way tends to approach the k-th order empirical entropy. Our evaluations show that this choice of offsets is superior to the rightmost LZ parsing and the bit-optimal LZ parsing on datasets with small high-order entropy.
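To make the two notions in the abstract concrete, here is a minimal Python sketch of the standard definitions only; it is not the HOLZ encoder itself. The function names `empirical_entropy` and `colex_prefix_order` are hypothetical, chosen for illustration. The first computes the k-th order empirical entropy of a text in bits per character, and the second lists the prefixes of a text in co-lexicographic order, that is, sorted by reading each prefix from right to left.

```python
from collections import Counter, defaultdict
from math import log2


def empirical_entropy(text: str, k: int = 0) -> float:
    """k-th order empirical entropy of `text`, in bits per character.

    Standard definition: group every character by the k characters
    preceding it, sum the zeroth-order entropy of each group weighted
    by its size, and divide by the text length.
    """
    n = len(text)
    if n == 0:
        return 0.0
    followers = defaultdict(Counter)
    for i in range(k, n):
        # text[i - k:i] is the length-k context preceding position i.
        followers[text[i - k:i]][text[i]] += 1
    total_bits = 0.0
    for counts in followers.values():
        m = sum(counts.values())
        total_bits -= sum(c * log2(c / m) for c in counts.values())
    return total_bits / n


def colex_prefix_order(text: str) -> list[int]:
    """Lengths of the non-empty prefixes of `text`, sorted co-lexicographically.

    Co-lexicographic order compares strings from their last character
    backwards, which is the same as sorting the reversed strings.
    This quadratic-space version is for illustration only.
    """
    return sorted(range(1, len(text) + 1), key=lambda j: text[:j][::-1])
```

For example, `empirical_entropy("abababab", 0)` is 1.0 while `empirical_entropy("abababab", 1)` is 0.0, because each character is fully determined by its predecessor. This is the kind of input with small high-order entropy that the abstract refers to.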