Linear-Time Algorithm for Long LCF with k Mismatches
In the Longest Common Factor with k Mismatches (LCF_k) problem, we are given two strings X and Y of total length n, and we are asked to find a pair of maximal-length factors, one of X and the other of Y, such that their Hamming distance is at most k. Thankachan et al. show that this problem can be solved in O(n ^k n) time and O(n) space for constant k. We consider the LCF_k(ℓ) problem in which we assume that the sought factors have length at least ℓ, and the LCF_k(ℓ) problem for ℓ=Ω(^2k+2 n), which we call the Long LCF_k problem. We use difference covers to reduce the Long LCF_k problem to a task involving m=O(n/^k+1n) synchronized factors. The latter can be solved in O(m ^k+1m) time, which results in a linear-time algorithm for Long LCF_k. In general, our solution to LCF_k(ℓ) for arbitrary ℓ takes O(n + n ^k+1 n/√(ℓ)) time.
READ FULL TEXT