A Semi-Automatic Approach for Syntax Error Reporting and Recovery in Parsing Expression Grammars
Error recovery is an essential feature for a parser that should be plugged in Integrated Development Environments (IDEs), which must build Abstract Syntax Trees (ASTs) even for syntactically invalid programs in order to offer features such as automated refactoring and code completion. Parsing Expressions Grammars (PEGs) are a formalism that naturally describes recursive top-down parsers using a restricted form of backtracking. Labeled failures are a conservative extension of PEGs that adds an error reporting mechanism for PEG parsers, and these labels can also be associated with recovery expressions to provide an error recovery mechanism. These expressions can use the full expressivity of PEGs to recover from syntactic errors. Manually annotating a large grammar with labels and recovery expressions can be difficult. In this work, we present algorithms that automatically annotates a PEG with labels, and build their corresponding recovery expressions. We evaluate these algorithms by using them to generate error recovering parsers for four programming languages: Titan, Pascal, C and Java. The results show that with a small amount of manual intervention our approach can be used to produce error recovering parsers for PEGs where most choices have alternatives with disjoint FIRST sets.
READ FULL TEXT