Regular Expressions with Backreferences: Polynomial-Time Matching Techniques
Regular expressions with backreferences (regex, for short), as supported by most modern libraries for regular expression matching, have an NP-complete matching problem. We define a complexity parameter of regex, called active variable degree, such that regex with this parameter bounded by a constant can be matched in polynomial-time. Moreover, we formulate a novel type of determinism for regex (on an automaton-theoretic level), which yields the class of memory-deterministic regex that can be matched in time O(|w|p(|r|)) for a polynomial p (where r is the regex and w the word). Natural extensions of these concepts lead to properties of regex that are intractable to check.
READ FULL TEXT