antlr4 rule not ignoring standalone open bracket

Question

The situation:

rule   : block+ ;
block  : '[' String ']' ;
String : ([a-z] | '[' | '\\]')+ ;

The trick is that String can contain [ without a backslash escape and ] with a backslash escape, so in this example:

[hello\]world][hello[[world]

The first block is parsed correctly, but in the second one the parser tries to find a ] for every [. Is there a way to tell the ANTLR parser to ignore these standalone [ characters? I can't change the format, so I need a workaround within ANTLR.

PS: Without ANTLR there is an algorithm that avoids this, something like: collect every [ in a queue until the first ] is found, and use only the head of the queue. But I really need ANTLR =_=
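For reference, the non-ANTLR approach described in the PS could look roughly like the sketch below. It is only an illustration of the idea: the function name split_blocks is hypothetical, and the scanner is more permissive than the grammar (it accepts any character inside a block, not just [a-z], and treats a backslash as escaping whatever follows it).

```python
def split_blocks(text):
    """Return the raw content of each [ ... ] block in text.

    Assumption: a block opens at '[' and closes at the first ']' that is
    not preceded by a backslash; any further '[' inside it is literal.
    """
    blocks = []
    i = 0
    n = len(text)
    while i < n:
        if text[i] != "[":
            raise ValueError(f"expected '[' at offset {i}")
        i += 1  # skip the opening bracket (the "head of the queue")
        start = i
        while i < n and text[i] != "]":
            # A backslash escapes the next character, so "\]" stays in the content.
            i += 2 if text[i] == "\\" else 1
        if i >= n:
            raise ValueError("unterminated block")
        blocks.append(text[start:i])
        i += 1  # skip the closing bracket
    return blocks
```

Called on the example input from the question, this yields two blocks, and the extra [ characters in the second block end up inside its content.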

Answer 1

You can use Lexer modes.

Lexical modes allow us to split a single lexer grammar into multiple sublexers. The lexer can only return tokens matched by rules from the current mode.

You can read more about lexer rules in the ANTLR documentation.

First you will need to split your grammar into a separate lexer and parser. Then just switch to another mode after you see an opening bracket.

Parser grammar:

parser grammar TestParser;

options { tokenVocab=TestLexer; }

rul   : block+ ;
block  : LBR STRING RBR ;

Lexer grammar:

lexer grammar TestLexer;

LBR: '[' -> pushMode(InString);

mode InString;

STRING : ([a-z] | '\\]' | '[')+ ;
RBR: ']' -> popMode;
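To see why the mode switch helps, here is a small hand-rolled simulation of the lexer grammar above. It is not ANTLR and does not use the ANTLR runtime; the RULES table and the tokenize function are hypothetical names made up for this sketch. It only mimics the mode stack: in the default mode the only token is LBR, and inside InString a [ can only ever be part of STRING.

```python
import re

# Token rules per mode, mirroring TestLexer: (token name, pattern, mode action).
RULES = {
    "DEFAULT_MODE": [("LBR", re.compile(r"\["), "push:InString")],
    "InString": [
        ("STRING", re.compile(r"(?:[a-z]|\\\]|\[)+"), None),
        ("RBR", re.compile(r"\]"), "pop"),
    ],
}


def tokenize(text):
    """Return (token name, matched text) pairs using a stack of lexer modes."""
    modes = ["DEFAULT_MODE"]
    tokens = []
    pos = 0
    while pos < len(text):
        # Only rules of the current mode are tried. Taking the first match is
        # enough here because no two rules of a mode can match at one position.
        for name, pattern, action in RULES[modes[-1]]:
            m = pattern.match(text, pos)
            if m:
                break
        else:
            raise ValueError(f"no rule in mode {modes[-1]} matches at offset {pos}")
        tokens.append((name, m.group()))
        pos = m.end()
        if action == "pop":
            modes.pop()
        elif action:
            modes.append(action.split(":", 1)[1])
    return tokens
```

For the input from the question, the simulated token stream is LBR, STRING, RBR, LBR, STRING, RBR, with the second STRING holding hello[[world, which is exactly what the block rule in the parser grammar expects.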

A working example is here.

You can also read the ANTLR documentation on lexer modes.

solution1
1 ACCEPTED 2016-01-24 21:10:42
