Escape characters for an antlr lexer and parser

Question

I am new to ANTLR and want to build a parser, part of which requires matching strings. I want to preserve the meaning of the following escape characters:

\0, \b, \t, \n, \f, \r, \", \', \\

Some of these symbols are used in various positions within the grammar of my language, so I am trying to define an ESCAPE_CHAR token as follows:

SINGLE_QUOTE: '\'' ;
DOUBLE_QUOTE: '"' ;
ESCAPE_ZERO : '\0' ;
ESCAPE_BACKSPACE : '\b' ;
ESCAPE_TAB : '\t' ;
ESCAPE_NEWLINE : '\n' ;
ESCAPE_FORMFEED : '\f' ;
ESCAPE_CARRIAGERETURN : '\r' ;
ESCAPE_BACKSLASH : '\\' ;
ESCAPE_CHAR: ESCAPE_ZERO | ESCAPE_BACKSPACE | ESCAPE_TAB | ESCAPE_NEWLINE | ESCAPE_FORMFEED | ESCAPE_CARRIAGERETURN | DOUBLE_QUOTE | SINGLE_QUOTE | ESCAPE_BACKSLASH ;

However, ESCAPE_ZERO is giving me the warning

non-fragment lexer rule ESCAPE_CHAR can match the empty string

And when making ESCAPE_ZERO a fragment, I see the warning

invalid escape sequence \0

I am new to ANTLR, so I don't really know what changes I need to make. Any help would be greatly appreciated.

Answer 1

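You need to escape the \ inside a literal in ANTLR as well. If you don't, ANTLR does not read the lexer rule ESCAPE_ZERO: '\0'; as a backslash followed by the digit zero. It treats \0 as an escape sequence of its own, and the resulting literal has no "width", which causes ANTLR to produce the warning [...] can match the empty string. The same applies to '\b', '\t', '\n', '\f' and '\r': as written, those literals match the control characters themselves rather than a backslash followed by a letter.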

Instead of all your separate rules, try something like this:

STRING
 : '"' ( ~[\\"\r\n] | ESCAPE_CHAR )* '"'
 ;

fragment ESCAPE_CHAR
 : '\\' [0btnfr"'\\]
 ;
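
The lexer only recognizes the string token; turning \n or \t inside it into the real character still has to happen afterwards, for example in a listener or visitor. Below is a minimal sketch in Python of what that step could look like. It is not ANTLR code: the regular expression merely mirrors the STRING and ESCAPE_CHAR rules above, and the names ESCAPES and unescape are hypothetical helpers chosen for this illustration.

```python
import re

# Regex counterpart of: fragment ESCAPE_CHAR : '\\' [0btnfr"'\\] ;
ESCAPE_CHAR = r"\\[0btnfr\"'\\]"

# Regex counterpart of: STRING : '"' ( ~[\\"\r\n] | ESCAPE_CHAR )* '"' ;
STRING = re.compile(r'"(?:[^\\"\r\n]|' + ESCAPE_CHAR + r')*"')

# Hypothetical lookup table: character after the backslash -> decoded character.
ESCAPES = {
    "0": "\0", "b": "\b", "t": "\t", "n": "\n", "f": "\f", "r": "\r",
    '"': '"', "'": "'", "\\": "\\",
}


def unescape(token_text):
    """Strip the surrounding quotes and decode each escape sequence."""
    if not STRING.fullmatch(token_text):
        raise ValueError(f"not a valid string literal: {token_text!r}")
    body = token_text[1:-1]
    # Each match is a backslash plus one character; look up that character.
    return re.sub(ESCAPE_CHAR, lambda m: ESCAPES[m.group(0)[1]], body)


if __name__ == "__main__":
    print(repr(unescape(r'"tab:\t quote:\" zero:\0"')))
```

In a real ANTLR project the same decoding would typically be applied to the text of the STRING token in the target language of the generated parser; the table-driven approach is just one way to do it.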
