The situation:
rule : block+ ;
block : '[' String ']' ;
String : ([a-z] | '[' | '\\]')+ ;
The trick is that String can contain [ without a backslash escape and ] with a backslash escape, so in this example:
[hello\]world][hello[[world]
The first block is parsed correctly, but in the second one the parser tries to find a ] for every [ . Is there a way to tell the ANTLR parser to ignore these standalone [ characters? I can't change the format, so I need a workaround within ANTLR.
PS: Without ANTLR there is an algorithm to avoid this, something like: collect every [ in a queue until we find the first ] and use only the head of the queue. But I really need ANTLR =_=
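For reference, the queue idea from the PS could look roughly like the sketch below. This is only an illustration of that description, not ANTLR code; the function name `split_blocks` is made up here, and unlike the grammar it does not check that the content is limited to [a-z].

```python
def split_blocks(text):
    """Return the contents of each [...] block in text (hypothetical helper)."""
    blocks = []
    open_positions = []  # index of every '[' seen since the last block closed
    i = 0
    while i < len(text):
        ch = text[i]
        # An escaped ']' is content, not a terminator: skip both characters.
        if ch == "\\" and i + 1 < len(text) and text[i + 1] == "]":
            i += 2
            continue
        if ch == "[":
            open_positions.append(i)
        elif ch == "]":
            if not open_positions:
                raise ValueError(f"unmatched ']' at index {i}")
            # Only the head of the queue opens the block; later '[' are content.
            start = open_positions[0]
            blocks.append(text[start + 1:i])
            open_positions.clear()
        i += 1
    if open_positions:
        raise ValueError("unterminated block")
    return blocks
```

Calling `split_blocks(r"[hello\]world][hello[[world]")` on the example input yields the two block bodies, with the inner [ characters kept as plain content.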
You can use lexer modes.
Lexical modes allow us to split a single lexer grammar into multiple sublexers. The lexer can only return tokens matched by rules from the current mode.
You can read more about lexer rules in the ANTLR documentation.
First you will need to divide your grammar into a separate lexer and parser. Then just switch to another mode after you see an opening bracket.
Parser grammar:
parser grammar TestParser;
options { tokenVocab=TestLexer; }
rul : block+ ;
block : LBR STRING RBR ;
Lexer grammar:
lexer grammar TestLexer;
LBR: '[' -> pushMode(InString);
mode InString;
STRING : ([a-z] | '\\]' | '[')+ ;
RBR: ']' -> popMode;
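To see why the mode switch helps, here is a small hand-written Python emulation of the two modes above. It is only a sketch and does not use the ANTLR runtime: the `RULES` table and `tokenize` function are invented for illustration, and it simply takes the first rule that matches instead of ANTLR's longest-match rule, which is enough here because the rules within each mode never start with the same character.

```python
import re

DEFAULT, IN_STRING = "DEFAULT_MODE", "InString"

# Each mode lists (token name, pattern, mode action), mirroring the lexer grammar.
RULES = {
    DEFAULT: [("LBR", re.compile(r"\["), ("push", IN_STRING))],
    IN_STRING: [
        ("STRING", re.compile(r"(?:[a-z]|\\\]|\[)+"), None),
        ("RBR", re.compile(r"\]"), ("pop", None)),
    ],
}

def tokenize(text):
    """Emulate the modal lexer and return a list of (token name, text) pairs."""
    modes = [DEFAULT]  # mode stack; only rules of the top mode may match
    tokens = []
    pos = 0
    while pos < len(text):
        for name, pattern, action in RULES[modes[-1]]:
            match = pattern.match(text, pos)
            if match:
                tokens.append((name, match.group()))
                pos = match.end()
                if action and action[0] == "push":
                    modes.append(action[1])
                elif action and action[0] == "pop":
                    modes.pop()
                break
        else:
            raise ValueError(f"no rule in mode {modes[-1]} matches at index {pos}")
    return tokens
```

For the input from the question, `tokenize(r"[hello\]world][hello[[world]")` gives LBR, STRING, RBR twice. Once the first [ pushes InString, every further [ can only be matched by STRING, so the parser never sees it as an LBR.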
A working example is here.
You can also read the documentation on lexer modes.