JFlex ambiguity
I have these two rules from a JFlex specification:
Bool = true
Ident = [:letter:][:letterdigit:]*
If I try, for example, to analyse the word "trueStat", it gets recognized as an Ident expression and not as Bool. How can I avoid this type of ambiguity in JFlex?
In almost all languages, a keyword is only recognised as such if it is a complete word. Otherwise, you would end up banning identifiers like format, downtime and endurance (which would instead start with the keywords for, do and end, respectively). That's quite confusing for programmers, although it's not unheard of. Lexical scanner generators like Flex and JFlex generally try to make the common case easy; that is why the snippet you provide recognises trueStat as an identifier.
But if you really want to recognise it as a keyword followed by an identifier, you can accomplish that by adding trailing context to all your keywords:
Bool = true/[:letterdigit:]*
Ident = [:letter:][:letterdigit:]*
With that pair of patterns, true will match the Bool rule even if it occurs as the start of trueStat. The pattern matches true and any alphanumeric string immediately following it, and then rewinds the input cursor so that the token matched is just true.
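The match-then-rewind behaviour can be imitated in plain Java to see what tokens come out. This is only a rough sketch of the idea, not how JFlex is implemented: the class name TrailingContextSketch, the scan method, and the regular expressions standing in for [:letter:] and [:letterdigit:] are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; none of these names come from JFlex.
final class TrailingContextSketch {
    // Group 1 is the token itself; group 2 plays the role of the trailing context.
    private static final Pattern BOOL = Pattern.compile("(true)([\\p{L}\\p{Nd}]*)");
    // Assumed approximation of [:letter:][:letterdigit:]*
    private static final Pattern IDENT = Pattern.compile("\\p{L}[\\p{L}\\p{Nd}]*");

    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            Matcher bool = BOOL.matcher(input).region(pos, input.length());
            Matcher ident = IDENT.matcher(input).region(pos, input.length());
            // Bool is tried first: with its context it is always as long as the
            // Ident match at the same position, and ties go to the earlier rule.
            if (bool.lookingAt()) {
                tokens.add("Bool:" + bool.group(1));
                pos = bool.end(1); // rewind: resume right after "true", not after the context
            } else if (ident.lookingAt()) {
                tokens.add("Ident:" + ident.group());
                pos = ident.end();
            } else {
                pos++; // skip whitespace and anything else the two rules do not cover
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("trueStat"));
    }
}
```

Under these assumptions, scanning trueStat yields a Bool token for true followed by an Ident token for Stat, because the second scan starts from the rewound cursor position.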
Note that, like Lex and Flex, JFlex accepts the longest match at the current input position; if more than one rule accepts this match, the action corresponding to the first such rule is executed. (See the manual section "How the Input is Matched" for a slightly longer explanation of the matching algorithm.) Trailing context is considered part of the match for the purposes of this rule (but, as noted above, is then removed from the match).
The consequence of this rule is that you should always place more specific patterns before the general patterns they might override, whether or not the specific pattern uses trailing context. So the Bool rule must precede the Ident rule.
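To make the effect of rule order concrete, here is a small Java sketch of "longest match wins, and the first rule wins a tie". It is a simplified model rather than JFlex's actual algorithm, and it leaves trailing context out entirely. The class name LongestMatchSketch, the BOOL_FIRST and IDENT_FIRST tables, and the regular expression approximating the Ident pattern are assumptions introduced for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; none of these names come from JFlex.
final class LongestMatchSketch {
    // Each entry is {rule name, regex}; array order stands for rule order in the spec.
    static final String[][] BOOL_FIRST = {
        {"Bool", "true"},
        {"Ident", "\\p{L}[\\p{L}\\p{Nd}]*"},
    };
    static final String[][] IDENT_FIRST = {
        {"Ident", "\\p{L}[\\p{L}\\p{Nd}]*"},
        {"Bool", "true"},
    };

    static List<String> scan(String input, String[][] rules) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestEnd = pos;
            for (String[] rule : rules) {
                Matcher m = Pattern.compile(rule[1]).matcher(input).region(pos, input.length());
                // Strictly greater: a later rule replaces an earlier one only with a longer match.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = rule[0];
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                pos++; // no rule matched here (e.g. whitespace); skip one character
                continue;
            }
            tokens.add(bestName + ":" + input.substring(pos, bestEnd));
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("true x1", BOOL_FIRST));
        System.out.println(scan("true x1", IDENT_FIRST));
    }
}
```

In this model the input true is reported as Bool only when the Bool rule comes first; with Ident first, the tie goes to Ident and the keyword is never seen. The input trueStat is an Ident under either order, since the Ident match is strictly longer.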