Space-Efficient Huffman Codes Revisited
Canonical Huffman code is an optimal prefix-free compression code whose codewords enumerated in the lexicographical order form a list of binary words in non-decreasing lengths. Gagie et al. (2015) gave a representation of this coding capable to encode or decode a symbol in constant worst case time. It uses σℓ_max + o(σ) + O(ℓ_max^2) bits of space, where σ and ℓ_max are the alphabet size and maximum codeword length, respectively. We refine their representation to reduce the space complexity to σℓ_max (1 + o(1)) bits while preserving the constant encode and decode times. Our algorithmic idea can be applied to any canonical code.
READ FULL TEXT