Adaptive Exact Learning in a Mixed-Up World: Dealing with Periodicity, Errors and Jumbled-Index Queries in String Reconstruction
We study the query complexity of exactly reconstructing a string from adaptive queries, such as substring, subsequence, and jumbled-index queries. Such problems have applications, e.g., in computational biology. We provide a number of new and improved bounds for exact string reconstruction for settings where either the string or the queries are "mixed-up". For example, we show that a periodic (i.e., "mixed-up") string, S=p^kp', of smallest period p, where |p'|<|p|, can be reconstructed using O(σ|p|+ n) substring queries, where σ is the alphabet size, if n=|S| is unknown. We also show that we can reconstruct S after having been corrupted by a small number of errors d, measured by Hamming distance. In this case, we give an algorithm that uses O(dσ|p| + d|p|n/d+1) queries. In addition, we show that a periodic string can be reconstructed using 2σ⌈ n⌉ + 2|p|⌈σ⌉ subsequence queries, and that general strings can be reconstructed using 2σ⌈ n⌉ + n⌈σ⌉ subsequence queries, without knowledge of n in advance. This latter result improves the previous best, decades-old result, by Skiena and Sundaram. Finally, we believe we are the first to study the exact-learning query complexity for string reconstruction using jumbled-index queries, which are a "mixed-up" typeA of query that have received much attention of late.
READ FULL TEXT