Understandable errors in ANTLR4
There is more than one way to peel an orange!
Once a colleague told me: “You can’t really generate user-friendly error messages with ANTLR.” This didn’t seem right - serious parser generators must have ways to generate proper errors…
Searching online showed that most approaches to error handling revolve around either custom implementations of ANTLRErrorStrategy or a “fail fast” strategy, which overrides DefaultErrorStrategy to throw ParseCancellationException and so stops parsing at the first syntax error.
Those approaches were nice, but I wanted a way to control both the error message and the “offending token”, that is, the token to be highlighted in the UI when showing syntax errors.
Consider the following ANTLR grammar:
grammar TestStrings;
This combined grammar parses a character sequence and detects C-style quoted strings. Now, to add custom errors in a declarative way, we will add a custom error that is raised if a string is missing its closing quote.
grammar TestStrings;
Note how _input.Lt(-1) is used to specify which token in the token stream is the “problematic” one, using an offset from the current position in the stream: -1 refers to the most recently consumed token.
Coupled with the following simple error listener, specifying errors this way gave me what I wanted.
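To make the offset idea concrete, here is a minimal sketch of how lookahead and lookback offsets relate to the current stream position. ToyTokenStream is a hypothetical stand-in written for this illustration, not the ANTLR runtime type, and it uses plain strings instead of real tokens. It only mirrors the convention that a positive offset looks ahead, a negative offset looks back, and zero is undefined.

```csharp
using System;
using System.Collections.Generic;

// Toy model of a token stream (an assumption for illustration, NOT the ANTLR runtime class).
public sealed class ToyTokenStream
{
    private readonly IReadOnlyList<string> _tokens;
    private int _index; // index of the next token to be consumed

    public ToyTokenStream(IReadOnlyList<string> tokens)
    {
        _tokens = tokens;
    }

    // Moves the current position past one token.
    public void Consume()
    {
        _index++;
    }

    // k > 0 looks ahead (1 = next unconsumed token),
    // k < 0 looks back (-1 = most recently consumed token),
    // k == 0 is undefined and yields null in this toy model.
    public string Lt(int k)
    {
        if (k == 0) return null;
        int i = k > 0 ? _index + k - 1 : _index + k;
        return i >= 0 && i < _tokens.Count ? _tokens[i] : null;
    }
}
```

In this model, after the opening quote and the string body have been consumed, Lt(-1) points at the body that just went by, which is the kind of token you would want to highlight for a missing closing quote.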
public readonly struct SyntaxError
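As a rough, self-contained illustration of the collecting idea (not the listing from this post), the sketch below records each reported error together with its offending token so a UI can highlight it later. The names SyntaxErrorInfo, CollectingErrorListener and Report are hypothetical, as are the fields chosen. In a real project the collector would derive from the ANTLR runtime's base error listener and do this work in its syntax error callback, whose exact signature depends on the runtime version.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical error record; its fields are assumptions made for this sketch.
public readonly struct SyntaxErrorInfo
{
    public SyntaxErrorInfo(string tokenText, int line, int column, string message)
    {
        TokenText = tokenText;
        Line = line;
        Column = column;
        Message = message;
    }

    public string TokenText { get; }  // text of the offending token
    public int Line { get; }          // line of the offending token
    public int Column { get; }        // character position within the line
    public string Message { get; }    // custom, user-facing message

    public override string ToString()
    {
        return $"{Line}:{Column} near '{TokenText}': {Message}";
    }
}

// Collects errors instead of printing them, so the caller decides how to show them.
// Stand-in only: it does not implement any ANTLR interface.
public sealed class CollectingErrorListener
{
    private readonly List<SyntaxErrorInfo> _errors = new List<SyntaxErrorInfo>();

    public IReadOnlyList<SyntaxErrorInfo> Errors
    {
        get { return _errors; }
    }

    // Mirrors the information a parser error callback typically receives.
    public void Report(string offendingTokenText, int line, int column, string message)
    {
        _errors.Add(new SyntaxErrorInfo(offendingTokenText, line, column, message));
    }
}
```

Keeping the record immutable and the listener free of any output logic means the same error list can feed a console message, a log entry, or an editor highlight.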
I am not an expert on ANTLR, so if you think this can be done in a better way, or that this is not a good way of handling errors in ANTLR, do let me know!
There is always room for improvement.