
Python/YACC: Resolving a shift/reduce conflict

I'm using PLY. Here is one of my states from parser.out:

state 3

    (5) course_data -> course .
    (6) course_data -> course . course_list_tail
    (3) or_phrase -> course . OR_CONJ COURSE_NUMBER
    (7) course_list_tail -> . , COURSE_NUMBER
    (8) course_list_tail -> . , COURSE_NUMBER course_list_tail

  ! shift/reduce conflict for OR_CONJ resolved as shift
    $end            reduce using rule 5 (course_data -> course .)
    OR_CONJ         shift and go to state 7
    ,               shift and go to state 8

  ! OR_CONJ         [ reduce using rule 5 (course_data -> course .) ]

    course_list_tail               shift and go to state 9

I want to resolve this as:

if OR_CONJ is followed by COURSE_NUMBER:
    shift and go to state 7
else:
    reduce using rule 5 (course_data -> course .)

How can I fix my parser file to reflect this? Do I need to handle a syntax error by backtracking and trying a different rule?

The documentation says:

These values are then used to attach a numerical precedence value and associativity direction to each grammar rule. This is always determined by looking at the precedence of the right-most terminal symbol.

What if the rule has no terminals?

UPDATE: The complete grammar:

Grammar

Rule 0     S' -> statement
Rule 1     statement -> course_data
Rule 2     or_phrase -> statement OR_CONJ statement
Rule 3     or_phrase -> course OR_CONJ COURSE_NUMBER
Rule 4     statement -> or_phrase
Rule 5     course_data -> course
Rule 6     course_data -> course course_list_tail
Rule 7     course_list_tail -> , COURSE_NUMBER
Rule 8     course_list_tail -> , COURSE_NUMBER course_list_tail
Rule 9     course -> DEPT_CODE COURSE_NUMBER

Your basic problem is that you need two tokens of lookahead to do what you want: when the input seen so far is a course and the lookahead is an OR_CONJ, you don't know whether to reduce the course to a course_data or to shift without also looking at the token after the OR_CONJ. There are a number of ways you can deal with this:

  • use an LR(2), LR(k), or GLR parser generator; any of these can deal with this.

  • use a lexer hack to do the lookahead: basically, have the lexer return two different OR_CONJ tokens depending on whether the following token is a COURSE_NUMBER or not.

  • factor the grammar to get rid of the conflict, which may result in a grammar that parses something slightly different from what you want (so you need some extra post-parse checks to reject some invalid constructs) and will generally make the grammar much harder to understand.
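The lexer hack in the second option can be sketched as a filter over the token stream. This is only an illustration under stated assumptions: the `Token` tuple stands in for whatever token objects your lexer produces (PLY tokens also carry `type` and `value` attributes), and `OR_CONJ_NUM` is a made-up token name that you would have to declare yourself and use in the `course OR_CONJ COURSE_NUMBER` rule. How the filter is wired between the lexer and the parser is left out.

```python
from collections import namedtuple

# Hypothetical token shape for illustration only; real lexer tokens
# would be used in practice.
Token = namedtuple("Token", ["type", "value"])


def split_or_conj(tokens):
    """Relabel each OR_CONJ according to the token that follows it.

    An OR_CONJ directly followed by a COURSE_NUMBER is re-typed as
    OR_CONJ_NUM (an assumed name); any other OR_CONJ passes through
    unchanged. All other tokens are yielded as they are.
    """
    tokens = iter(tokens)
    current = next(tokens, None)
    while current is not None:
        following = next(tokens, None)  # the extra token of lookahead
        if (current.type == "OR_CONJ"
                and following is not None
                and following.type == "COURSE_NUMBER"):
            current = current._replace(type="OR_CONJ_NUM")
        yield current
        current = following


stream = [
    Token("DEPT_CODE", "CS"), Token("COURSE_NUMBER", "101"),
    Token("OR_CONJ", "or"), Token("COURSE_NUMBER", "102"),
    Token("OR_CONJ", "or"), Token("DEPT_CODE", "MATH"),
    Token("COURSE_NUMBER", "200"),
]
print([t.type for t in split_or_conj(stream)])
```

With the two token types distinguished this way, the parser no longer has to decide between shifting and reducing on the same lookahead token, because the lexer has already done the second token of lookahead for it.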

Note that your grammar as given is also ambiguous as to how three or more courses connected in a single statement associate. This is easily fixed by rewriting the grammar into a clearer left-recursive form:

Rule 1    statement -> course
Rule 2    statement -> statement OR_CONJ course
Rule 3    course -> DEPT_CODE course_list
Rule 4    course -> DEPT_CODE course_list OR_CONJ COURSE_NUMBER
Rule 5    course_list -> COURSE_NUMBER
Rule 6    course_list -> course_list , COURSE_NUMBER

This could also be rewritten as right-recursive for an LL parser generator, but it still has the 2-token lookahead problem. One way of refactoring it to make that go away would be to make COURSE_NUMBER by itself a valid course and recombine it with the previous course in a post-pass (or give an error if it's the first course in a statement). Then rule 4 becomes:

Rule 4    course -> COURSE_NUMBER

and you have no conflicts.
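The post-pass could look roughly like the following. It is a sketch, not something PLY provides: it assumes the parser actions build a flat list of `(dept_code, course_number)` pairs in source order, with `dept_code` set to `None` for a course that was parsed from a bare COURSE_NUMBER. Both the representation and the function name are invented for illustration.

```python
def attach_departments(courses):
    """Give each bare course number the department of the previous course.

    `courses` is a list of (dept_code, course_number) pairs in source
    order; dept_code is None when the course came from a lone
    COURSE_NUMBER. Raises SyntaxError if a bare number comes first,
    since there is no previous course to recombine it with.
    """
    resolved = []
    last_dept = None
    for dept, number in courses:
        if dept is None:
            if last_dept is None:
                raise SyntaxError(
                    f"course number {number} has no preceding department code"
                )
            dept = last_dept
        last_dept = dept
        resolved.append((dept, number))
    return resolved


parsed = [("CS", "101"), (None, "102"), ("MATH", "200")]
print(attach_departments(parsed))
```

This is where the check the relaxed grammar can no longer express ends up: a statement that starts with a bare number parses without conflict but is rejected here.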
