
JFlex ambiguity

I have these two rules from a JFlex specification:

Bool = true
Ident = [:letter:][:letterdigit:]*

If I try, for example, to analyse the word "trueStat", it gets recognized as an Ident expression and not as Bool. How can I avoid this type of ambiguity in JFlex?

In almost all languages, a keyword is only recognised as such if it is a complete word. Otherwise, you would end up banning identifiers like format, downtime and endurance (which would instead start with the keywords for, do and end, respectively). That's quite confusing for programmers, although it's not unheard-of. Lexical scanner generators like Flex and JFlex generally try to make the common case easy; that is why the snippet you provide recognises trueStat as an identifier. But if you really want to recognise it as a keyword followed by an identifier, you can accomplish that by adding trailing context to all your keywords:

Bool = true/[:letterdigit:]*
Ident = [:letter:][:letterdigit:]*

With that pair of patterns, true will match the Bool rule, even if it occurs as trueStat. The pattern matches true and any alphanumeric string immediately following it, and then rewinds the input cursor so that the token matched is just true.

Note that, like Lex and Flex, JFlex accepts the longest match at the current input position; if more than one rule accepts this match, the action corresponding to the first such rule is executed. (See the manual section "How the Input is Matched" for a slightly longer explanation of the matching algorithm.) Trailing context is considered part of the match for the purposes of this rule (but, as noted above, is then removed from the match).

The consequence of this rule is that you should always place more specific patterns before the general patterns they might override, whether or not the specific pattern uses trailing context. So the Bool rule must precede the Ident rule.
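To make the selection rule concrete, here is a rough Java sketch that imitates it with java.util.regex. It is not how JFlex is implemented, and the names TrailingContextDemo, Rule and tokenize are made up for illustration. ASCII letters and digits stand in for the character classes in the question, and a greedy regex match is assumed to be the longest match, which holds for these two simple patterns but not for arbitrary regular expressions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical mini-scanner imitating longest-match, first-rule-wins selection. */
class TrailingContextDemo {

    /** A rule: a name, a token pattern, and an optional trailing-context pattern. */
    static final class Rule {
        final String name;
        final Pattern token;
        final Pattern trailing; // null when the rule has no trailing context

        Rule(String name, String token, String trailing) {
            this.name = name;
            this.token = Pattern.compile(token);
            this.trailing = trailing == null ? null : Pattern.compile(trailing);
        }
    }

    // Stand-ins for the two rules in the answer (ASCII only, an assumption).
    static final Rule BOOL = new Rule("Bool", "true", "[A-Za-z0-9]*");
    static final Rule IDENT = new Rule("Ident", "[A-Za-z][A-Za-z0-9]*", null);

    static List<String> tokenize(String input, List<Rule> rules) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            Rule best = null;
            int bestTotal = 0; // match length including trailing context
            int bestToken = 0; // length that is actually consumed
            for (Rule rule : rules) {
                Matcher m = rule.token.matcher(input).region(pos, input.length());
                if (!m.lookingAt()) continue;
                int tokenEnd = m.end();
                int totalEnd = tokenEnd;
                if (rule.trailing != null) {
                    Matcher t = rule.trailing.matcher(input).region(tokenEnd, input.length());
                    if (!t.lookingAt()) continue;
                    totalEnd = t.end();
                }
                // Only a strictly longer match replaces the current best,
                // so on a tie the rule listed first is kept.
                if (totalEnd - pos > bestTotal) {
                    best = rule;
                    bestTotal = totalEnd - pos;
                    bestToken = tokenEnd - pos;
                }
            }
            if (best == null) { pos++; continue; } // skip characters no rule matches
            // "Rewind": emit and consume only the part before the trailing context.
            tokens.add(best.name + ":" + input.substring(pos, pos + bestToken));
            pos += bestToken;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("trueStat", Arrays.asList(BOOL, IDENT))); // [Bool:true, Ident:Stat]
        System.out.println(tokenize("trueStat", Arrays.asList(IDENT, BOOL))); // [Ident:trueStat]
    }
}
```

With Bool listed first, both rules match all eight characters of trueStat, the tie goes to Bool, and only true is consumed, so Stat is scanned next as an Ident. With the order swapped, Ident wins the same tie and the whole word comes out as one identifier.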
