Resolving Lexer and Parser Ambiguity in ANTLR4

Question

In ANTLR4 I have a lexer rule that matches any word made of characters other than spaces and line breaks. It is defined like this:

WORD : ~[ \t\r\n:,]+;

I also have a lexer rule (defined before WORD) that switches to an EVAL mode:

OPENEVAL : '${' -> pushMode(EVAL);

mode EVAL;
CLOSEEVAL : '}' -> popMode;
... (more lexer definitions for EVAL mode) ...

In the parser file I'm trying to match either a grammar rule or a word, so I do the following:

eval : evaluation
     | WORD;

evaluation : OPENEVAL somestuff CLOSEEVAL;

somestuff uses lexer rules defined in the EVAL mode. The problem is that when the eval rule is evaluated, the input is recognized as a WORD token rather than as the evaluation grammar rule. For example, if I enter text like:

${stuff to be evaluated}

It should go to the evaluation rule, but instead it is identified as a WORD (taking only the "${stuff" part).

I know that there is an ambiguity between evaluation and WORD, but I thought that ANTLR would take the first matching alternative of the parser rule (evaluation in this case).

Sorry if this is confusing; I tried to summarize it as well as I could (I didn't want to post the full parser and lexer contents and create a wall of text).

Another option I considered was to define WORD as anything except text surrounded by ${ and }, but I don't know how to write such a lexer rule.

How can I solve this and distinguish between evaluation and WORD?

Answer 1

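The lexer tokenizes the input without consulting the parser, and it prefers the rule that matches the most characters; rule order only breaks ties between matches of equal length. Here WORD matches ${stuff while OPENEVAL matches only ${, so WORD wins even though OPENEVAL is defined first.

You need to include a predicate that prevents a $ from being included in a WORD when it is followed by {.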

WORD
  : ( ~[ \t\r\n:,$]
    | '$' {_input.LA(1) != '{'}?
    )+
  ;
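
For context, here is a minimal sketch of how the predicated WORD rule might sit in a complete lexer grammar. The grammar name EvalLexer and the rules WS, EVAL_WORD and EVAL_WS are hypothetical placeholders for the definitions elided in the question, and the predicate is Java-target code, so other targets would need it rewritten in their own language.

```antlr
// Sketch only: EvalLexer, WS, EVAL_WORD and EVAL_WS are illustrative names,
// not taken from the original grammar.
lexer grammar EvalLexer;

// Defined before WORD, as in the question.
OPENEVAL : '${' -> pushMode(EVAL);

// A '$' may be part of a WORD only when the next character is not '{'.
// The predicate runs after '$' has been consumed, so LA(1) is the
// character that follows it. (Java target.)
WORD
  : ( ~[ \t\r\n:,$]
    | '$' {_input.LA(1) != '{'}?
    )+
  ;

// Hypothetical whitespace handling for the default mode.
WS : [ \t\r\n]+ -> skip;

mode EVAL;
CLOSEEVAL : '}' -> popMode;

// Hypothetical stand-ins for the EVAL-mode rules elided in the question.
EVAL_WORD : ~[ \t\r\n}]+;
EVAL_WS   : [ \t\r\n]+ -> skip;
```

Under these assumptions, the input ${stuff to be evaluated} should tokenize as OPENEVAL, four EVAL_WORD tokens and CLOSEEVAL, while a $ that is not followed by { (as in $5) still ends up inside a WORD.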

1 accepted 2014-01-06 13:30:22
