
Python lexer: token priority and rule order when dealing with ambiguities. Why does STRING have priority over WORD?

I am studying lexers in the Programming Languages course by Westley Weimer.

The notes are here: https://www.udacity.com/wiki/cs262/unit-2#quiz-rule-order

{Video, if you care to watch, last 40 seconds.} https://www.udacity.com/course/viewer#!/c-cs262/l-48713810/e-48652568/m-48676965

Quiz: When two token definitions can match the same string, the behavior of our lexical analyzer may be ambiguous...

Suppose we have the input string

hello, "world,"

and we want the input string to yield WORD STRING. Which rule must come last? That is: "...what I'd like you to do is tell me which one of these functions, which one of these rules, would have to come last, bearing in mind that the one that comes first wins all ties, in order for hello, "world" to break down into a word followed by a string."

def t_WORD(token):
    r'[^ <>]+'
    return token


def t_STRING(token):
    r'"[^"]*"'
    return token

The answer is:

t_STRING, then t_WORD

Even after watching the video multiple times, I don't get it: why does STRING have priority over WORD?

Please enlighten.

Thank you very much.

Both patterns match "world," (quotes included), but you want that text to come back as a STRING token. To get that, t_STRING needs priority, so it must be placed first: when two or more patterns have the same longest match, the earliest pattern wins. That leaves t_WORD as the rule that must come last.
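To see the tie-break in action, here is a minimal sketch using only Python's `re` module. It is not the course's lexer library; the `tokenize` helper and the `WORD`/`STRING` tuples are hypothetical names made up for this illustration, and the loop simply applies the rule stated above (longest match, ties going to the earliest rule).

```python
import re

# The two patterns from the question, as (token type, regex) pairs.
WORD = ("WORD", r'[^ <>]+')
STRING = ("STRING", r'"[^"]*"')


def tokenize(text, rules):
    """Return the token types found in text.

    Assumption for this sketch: the longest match wins, and when several
    rules tie on length, the rule listed earliest in `rules` wins.
    """
    compiled = [(name, re.compile(pattern)) for name, pattern in rules]
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos] == " ":  # skip spaces between tokens
            pos += 1
            continue
        candidates = []
        for name, regex in compiled:
            match = regex.match(text, pos)
            if match and match.end() > pos:
                candidates.append((name, match))
        if not candidates:
            raise ValueError(f"no rule matches at position {pos}")
        # max() returns the first maximal item, so ties go to the earliest rule.
        name, match = max(candidates, key=lambda c: c[1].end())
        tokens.append(name)
        pos = match.end()
    return tokens


source = 'hello, "world,"'
print(tokenize(source, [STRING, WORD]))  # ['WORD', 'STRING']
print(tokenize(source, [WORD, STRING]))  # ['WORD', 'WORD']
```

With STRING listed first, the quoted text is tagged STRING; with WORD first, the same text ties on length and WORD takes it, which is why t_WORD has to come last.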
