Resolving Lexer and Parser ambiguities in ANTLR4

In ANTLR4 I have a lexer rule that matches any word made of any characters except whitespace, colons, and commas. It is defined like this:

WORD : ~[ \t\r\n:,]+;

I also have a lexer rule (defined before WORD) for switching to an EVAL mode:

OPENEVAL : '${' -> pushMode(EVAL);

mode EVAL;
CLOSEEVAL : '}' -> popMode;
... (more lexer definitions for EVAL mode) ...

In the parser file I'm trying to match either a grammar rule or a word, so I do the following:

eval : evaluation
     | WORD;

evaluation : OPENEVAL somestuff CLOSEEVAL;

somestuff uses lexer rules defined in the EVAL mode. The problem is that, when evaluating the eval rule, the text is identified as a WORD token and not as an evaluation grammar rule. I mean, if I enter some text like:

${stuff to be evaluated}

It should go to the evaluation rule, but instead it is identified as a WORD (taking only the "${stuff" part).

I know that there is an ambiguity between evaluation and WORD, but I thought that ANTLR would take the first matching alternative of the parser rule (evaluation in this case).

Sorry if this is too confusing. I tried to summarize it as well as possible (I didn't want to post the full parser and lexer contents, to avoid a wall of text).

Another option I considered was to define WORD as anything except text surrounded by ${ and }, but I don't know how to write such a lexer rule.

How can I solve this and distinguish between evaluation and WORD?

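The lexer tokenizes the input before the parser ever sees it, and it always prefers the longest match, so WORD (which can match "${stuff") beats OPENEVAL (which matches only "${"). The order of the alternatives in the parser rule has no influence on that. You need to include a predicate that prevents a $ from being included in a WORD when it's followed by { .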

WORD
  : ( ~[ \t\r\n:,$]
    | '$' {_input.LA(1) != '{'}?
    )+
  ;
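
To see what the predicate changes, here is a rough Python sketch that imitates the lexer's longest-match behaviour for the first token only. It is not ANTLR and it ignores modes. The names first_token, ORIGINAL_RULES and FIXED_RULES are made up for this illustration, and the regex lookahead `(?!\{)` is only my stand-in for the `_input.LA(1) != '{'` predicate.

```python
import re

# Default-mode rules from the question, in declaration order.
ORIGINAL_RULES = [
    ("OPENEVAL", re.compile(r"\$\{")),
    ("WORD", re.compile(r"[^ \t\r\n:,]+")),
]

# Same rules, but WORD only accepts '$' when the next character is not '{'.
# The negative lookahead approximates the semantic predicate in the answer.
FIXED_RULES = [
    ("OPENEVAL", re.compile(r"\$\{")),
    ("WORD", re.compile(r"(?:[^ \t\r\n:,$]|\$(?!\{))+")),
]


def first_token(text, rules):
    """Return (rule name, lexeme) for the longest match at the start of text.

    When two rules match the same length, the one declared first wins,
    because a later rule only replaces the current best if it is longer.
    """
    best = None
    for name, pattern in rules:
        m = pattern.match(text)
        if m and m.end() > 0 and (best is None or m.end() > len(best[1])):
            best = (name, m.group())
    return best


if __name__ == "__main__":
    source = "${stuff to be evaluated}"
    print(first_token(source, ORIGINAL_RULES))  # ('WORD', '${stuff')
    print(first_token(source, FIXED_RULES))     # ('OPENEVAL', '${')
    print(first_token("$5", FIXED_RULES))       # ('WORD', '$5')
```

With the original rules WORD wins because it matches seven characters against OPENEVAL's two. With the predicate, WORD cannot start at "${" at all, so OPENEVAL is the only candidate, while a lone $ (as in "$5") is still accepted as part of a WORD.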
