Simplifying grammar via operator precedence

I am trying to parse C. I have been consulting some context-free C grammars, and I have observed that they usually model expressions with "chained" production rules. For example, [here][1] the logical OR and logical AND expressions are modeled like this:

<logical-or-expression> ::= <logical-and-expression>
                          | <logical-or-expression> || <logical-and-expression>

<logical-and-expression> ::= <inclusive-or-expression>
                           | <logical-and-expression> && <inclusive-or-expression>

I say the expressions are chained because they follow this structure:

expression with operator(N) ::= expression with operator(N+1)
        | (expression with operator(N)) operator(N) (expression with operator(N+1))

where N is the precedence of the operator. I understand that the objective is to disambiguate the language and introduce precedence and associativity rules in a purely syntactic manner.
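To make the chained structure concrete, here is a minimal Python sketch of how the two rules above map onto a hand-written recursive-descent parser, one function per precedence level. The function names, the tuple-based tree, and the tiny tokenizer are assumptions made for illustration only; `parse_operand` stands in for `<inclusive-or-expression>` and everything below it. Each left-recursive alternative becomes a loop, which is what yields left associativity.

```python
import re

# Illustrative tokenizer: identifiers plus the two operators, no error handling.
TOKEN = re.compile(r"\s*(\|\||&&|\w+)")

def tokenize(text):
    return TOKEN.findall(text)

def parse_logical_or(tokens, pos=0):
    # <logical-or> ::= <logical-and> | <logical-or> "||" <logical-and>
    left, pos = parse_logical_and(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "||":
        right, pos = parse_logical_and(tokens, pos + 1)
        left = ("||", left, right)  # fold to the left
    return left, pos

def parse_logical_and(tokens, pos):
    # <logical-and> ::= <operand> | <logical-and> "&&" <operand>
    left, pos = parse_operand(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "&&":
        right, pos = parse_operand(tokens, pos + 1)
        left = ("&&", left, right)
    return left, pos

def parse_operand(tokens, pos):
    # Stand-in for the next level down in the chain.
    return tokens[pos], pos + 1

tree, _ = parse_logical_or(tokenize("a || b && c"))
print(tree)  # ('||', 'a', ('&&', 'b', 'c'))
```

Because `parse_logical_or` can only reach an operand by going through `parse_logical_and`, `&&` binds tighter than `||` without any precedence table: the precedence is encoded in which function calls which.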

Is there any reason to model expressions like this in an actual parser with operator precedence support? My initial idea was to implement them simply as:

constant_expression ::= expression1 binary_op expression2

where binary_op is any binary operator, and then disambiguate by setting the precedence of all the operators. For example:

logical_expr ::= simple_expr | logical_expr && logical_expr | logical_expr || logical_expr

and then set the precedence of the && operator higher than that of ||. I think this tactic would give a much simpler grammar, as it would eliminate the need for a different rule for every level of precedence, but I am reluctant to use it because all the implementations I have seen use the former strategy, even in cases where the parser had precedence support.
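As a rough illustration of the "one flat rule plus a precedence table" idea, the sketch below uses precedence climbing in Python. This is a hand-written technique, not what a parser generator does internally, and the names `PRECEDENCE` and `parse_expr` as well as the numeric levels are assumptions chosen for the example.

```python
import re

# Hypothetical precedence table: a larger number binds tighter.
PRECEDENCE = {"||": 1, "&&": 2}
TOKEN = re.compile(r"\s*(\|\||&&|\w+)")

def tokenize(text):
    return TOKEN.findall(text)

def parse_expr(tokens, pos=0, min_prec=1):
    # logical_expr ::= simple_expr | logical_expr op logical_expr
    left, pos = tokens[pos], pos + 1  # simple_expr
    while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
        op = tokens[pos]
        # Requiring a strictly higher level on the right-hand side
        # makes every operator left-associative.
        right, pos = parse_expr(tokens, pos + 1, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left, pos

print(parse_expr(tokenize("a || b && c"))[0])  # ('||', 'a', ('&&', 'b', 'c'))
```

Adding another binary operator here means adding one entry to the table rather than one more grammar rule, which is the simplification the flat grammar is after.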

Thanks in advance.

[1]: https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm

Many LR-style parsers can handle operator precedence rules using some mechanism external to the grammar itself, in part because it allows you to skip this “layering” approach to writing CFGs. If you have a parser generator that supports this, it's fine to write an ambiguous grammar and then add those external rules to get the precedence and associativity right.

As a note: parsers for CFGs and BNF rules are usually insensitive to the order in which rules are written, so listing the operators from highest precedence to lowest precedence alone isn't sufficient. (PEG parsers, on the other hand, do represent ordered choices.) Also, due to how most parser generators work (having code to execute associated with each production, and using the terminals in a production to determine operator precedence), it's often easier to have separate rules, one per binary operator, than it is to have one rule of the form “Expr Operator Expr.” But otherwise the basic approach is sound.
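To illustrate the point about terminals, here is a small Python sketch of the shift/reduce decision that yacc-style generators make from precedence declarations. It is a simplified model, not any tool's actual code: the `DECLARATIONS` list, the `resolve` function, and the inclusion of `=` as a right-associative example are all assumptions for illustration, and non-associative operators are left out.

```python
# Hypothetical declarations, lowest precedence first, in the spirit of
# %right / %left lines in a yacc-style grammar file.
DECLARATIONS = [("right", ["="]), ("left", ["||"]), ("left", ["&&"])]

LEVELS = {}
for level, (assoc, toks) in enumerate(DECLARATIONS, start=1):
    for tok in toks:
        LEVELS[tok] = (level, assoc)

def resolve(rule_terminal, lookahead):
    """Pick an action when a rule like Expr -> Expr op Expr is complete.

    rule_terminal: the last terminal in the completed rule (its precedence).
    lookahead:     the next input token.
    """
    rule_level, _ = LEVELS[rule_terminal]
    look_level, look_assoc = LEVELS[lookahead]
    if look_level > rule_level:
        return "shift"   # the next operator binds tighter
    if look_level < rule_level:
        return "reduce"  # finish the current operator first
    # Same level: associativity decides.
    return "reduce" if look_assoc == "left" else "shift"

print(resolve("||", "&&"))  # shift
print(resolve("&&", "&&"))  # reduce
```

In this model the rule's precedence comes from a terminal that appears in the rule itself. A single rule of the form "Expr Operator Expr", where Operator is a nonterminal, contains no such terminal, which is one reason writing one rule per operator tends to be easier.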
