How to transform this grammar to LR(1)?

I have a grammar with an LR(1) conflict which I cannot resolve, yet the grammar should be unambiguous. I'll first demonstrate the problem on a simplified grammar with five tokens: '(', ')', '{}', ',' and id.

The EBNF would look like this:

      args = ( id ',' )*

expression = id
           | '(' expression ')'
           | '(' args ')' '{}'

The grammar is unambiguous and requires at most two tokens of lookahead. After '(' is shifted, there are only five possibilities for the tokens that follow:

  1. ( → Recurse.
  2. ) → Reduce as '(' args ')' '{}' with empty args.
  3. id ) not {} → Reduce as '(' expression ')'.
  4. id ) {} → Reduce as '(' args ')' '{}'.
  5. id , → Reduce as '(' args ')' '{}' (eventually).

A naive translation yields the following result (and conflicts):

   formal_arg: Ident
               {}

  formal_args: formal_arg Comma formal_args
             | formal_arg
             | /* nothing */
               {}

      primary: Ident
             | LParen formal_args RParen Curly
             | LParen primary RParen
               {}

So, deciding can require inspecting up to three tokens after the opening '(' (two tokens of lookahead once id has been shifted). I know that an LR(3) grammar can be transformed to an LR(1) grammar.

However, I don't quite understand how to do the transformation in this particular case. Note that the simplified grammar above is extracted from a larger body of code; in particular, is it possible to transform primary without touching expr and everything above it?

I provided a solution to a very similar problem here: Is C#'s lambda expression grammar LALR(1)? The basic idea is to separate out the ( id ) case from the other two possibilities, ( expr_not_id ) and ( list_at_least_2_ids ). Then the decision about how to reduce ( id ) can be deferred until the lookahead token is available (in your case, { , assuming that that is sufficient).
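For the simplified grammar in the question, that idea might look roughly like the sketch below, written in the question's own notation. The nonterminal names paren_id, primary_not_id and formal_args_2 are made up for illustration, and formal_args_2 here means a list containing at least one comma rather than strictly two ids. The sketch assumes that Curly can never follow a complete primary. It only covers the simplified grammar, where primary is the only expression rule.

```yacc
/* Sketch only: paren_id, primary_not_id and formal_args_2 are
   hypothetical names, not part of the original grammar. */

      primary: Ident
             | primary_not_id
               {}

/* "( id )" is recognised on its own, so Ident never has to be
   reduced to either primary or formal_arg before ")" is seen. */
     paren_id: LParen Ident RParen
               {}

primary_not_id: paren_id                              /* ( id ) used as an expression */
             | paren_id Curly                        /* ( id ) {}                    */
             | LParen primary_not_id RParen          /* ( expr_not_id )              */
             | LParen RParen Curly                   /* ( ) {}                       */
             | LParen formal_args_2 RParen Curly     /* ( id , ... ) {}              */
               {}

/* An argument list that contains at least one comma. */
formal_args_2: Ident Comma formal_args
               {}

  formal_args: Ident Comma formal_args
             | Ident
             | /* nothing */
               {}
```

After paren_id has been reduced, a single token of lookahead settles the choice: shift Curly if it is next, otherwise reduce paren_id to primary_not_id.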

Unfortunately, while the transformation of expr into expr_not_id is pretty straightforward and almost mechanical, it definitely touches a lot of productions. Also, it's somewhat ugly. So it does not solve the problem you pose in your last sentence. I don't actually think that it is possible to transform primary without touching expr, but I've been surprised before.

(The other obvious solution, since the grammar is in fact unambiguous, is to use a GLR parser-generator, but I don't believe the parser-generator you are using has that feature.)
