How best to parse a comma-separated list in a PEG grammar

I'm trying to parse a comma-separated list. To simplify, I'm just using digits. These expressions would be valid:

(1, 4, 3)

()

(4)

I can think of two ways to do this, and I'm wondering why exactly the failing example does not work. I believe it is correct as BNF, but I can't get it to work as a PEG. Can anyone explain exactly why? I'm trying to get a better understanding of the PEG parsing logic.

I'm testing with the online parser generator here: https://pegjs.org/online

This does not work:

list = '(' some_digits? ')'
some_digits = digit / ', ' some_digits
digit = [0-9]

(Actually, the grammar itself compiles fine, and the parser accepts () and (1), but it doesn't recognize (1, 2).)

But this does work:

list = '(' some_digits? ')'
some_digits = digit another_digit*
another_digit = ', ' digit
digit = [0-9]

Why is that? (Grammar novice here.)

Cool question. After digging around in the PEG.js docs for a bit, I found that the / operator means:

Try to match the first expression, if it does not succeed, try the second one, etc. Return the match result of the first successfully matched expression. If no expression matches, consider the match failed.
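
To make the "first successful match wins" rule concrete, here is a minimal sketch of ordered choice written as hand-rolled Python parser combinators. This is not PEG.js code; the helper names (literal, sequence, choice) are made up for illustration. Each parser takes a string and a position and returns the new position, or None on failure.

```python
def literal(text):
    """Match an exact string at the current position."""
    def parse(s, pos):
        return pos + len(text) if s.startswith(text, pos) else None
    return parse

def sequence(*parsers):
    """Match every parser in order; fail as soon as one fails."""
    def parse(s, pos):
        for p in parsers:
            pos = p(s, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*parsers):
    """PEG ordered choice: commit to the first alternative that succeeds."""
    def parse(s, pos):
        for p in parsers:
            end = p(s, pos)
            if end is not None:
                return end  # later alternatives are never tried
        return None
    return parse

# The shorter alternative first: choice matches "a", then "!" fails on "b".
short_first = sequence(choice(literal("a"), literal("ab")), literal("!"))
# The longer alternative first: choice matches "ab", then "!" matches.
long_first = sequence(choice(literal("ab"), literal("a")), literal("!"))

print(short_first("ab!", 0))  # None
print(long_first("ab!", 0))   # 3
```

The point of the sketch is that once choice has returned a success, a later failure in the enclosing sequence does not send the parser back to try the remaining alternatives. That is the key difference from how a BNF grammar is usually read.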

So this led me to the solution:

list = '(' some_digits? ')'
some_digits = digit ', ' some_digits / digit
digit = [0-9]

The reason this works:

input: (1, 4)

  • eats '('
  • checks whether there are some digits (some_digits?)
  • tries some_digits, first alternative:
    • eats '1'
    • eats ', '
    • tries some_digits, first alternative:
      • eats '4'
      • fails to eat ', ', so this alternative fails and the parser backtracks to before '4'
    • tries some_digits, second alternative:
      • eats '4'
      • succeeds
    • succeeds
  • eats ')'
  • succeeds
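
The steps above can be sketched as a small hand-written recursive-descent recognizer in Python. This is only an illustration of the control flow, not what PEG.js generates; the function names are made up for the sketch. Each function returns the position after the match, or None on failure.

```python
def parse_literal(s, pos, text):
    """Match an exact string; return the new position or None."""
    return pos + len(text) if s.startswith(text, pos) else None

def parse_digit(s, pos):
    """digit = [0-9]"""
    if pos < len(s) and s[pos] in "0123456789":
        return pos + 1
    return None

def parse_some_digits(s, pos):
    """some_digits = digit ', ' some_digits / digit"""
    # First alternative: digit ', ' some_digits
    end = parse_digit(s, pos)
    if end is not None:
        end = parse_literal(s, end, ", ")
    if end is not None:
        end = parse_some_digits(s, end)
    if end is not None:
        return end
    # Second alternative: retry from the original position with a lone digit
    return parse_digit(s, pos)

def parse_list(s):
    """list = '(' some_digits? ')'; True if the whole input matches."""
    pos = parse_literal(s, 0, "(")
    if pos is None:
        return False
    inner = parse_some_digits(s, pos)
    if inner is not None:  # some_digits is optional
        pos = inner
    pos = parse_literal(s, pos, ")")
    return pos == len(s)

for text in ["(1, 4)", "()", "(4)", "(1, 4, 3)", "(1, )"]:
    print(text, parse_list(text))
```

With this ordering the three inputs from the question and (1, 4) are accepted, while a dangling separator such as (1, ) is rejected.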

If you reverse the order of the some_digits alternatives, the first number the parser comes across gets eaten by digit, that alternative succeeds, and no recursion occurs. The parser then expects ')' but finds ',' instead, so it reports an error.
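
The same effect can be seen with the rule exactly as written in the question, some_digits = digit / ', ' some_digits. The following Python sketch hand-codes that rule (the function names are illustrative, and this is not PEG.js output) to show that the recursive alternative is only reached when the input does not start with a digit.

```python
def parse_digit(s, pos):
    """digit = [0-9]; return the position after the digit, or None."""
    if pos < len(s) and s[pos] in "0123456789":
        return pos + 1
    return None

def parse_some_digits(s, pos):
    """some_digits = digit / ', ' some_digits  (the rule from the question)"""
    end = parse_digit(s, pos)
    if end is not None:
        return end  # first alternative succeeded, so the second is never tried
    if s.startswith(", ", pos):
        return parse_some_digits(s, pos + 2)
    return None

def parse_list(s):
    """list = '(' some_digits? ')'; True if the whole input matches."""
    if not s.startswith("("):
        return False
    pos = 1
    inner = parse_some_digits(s, pos)
    if inner is not None:  # some_digits is optional
        pos = inner
    return s.startswith(")", pos) and pos + 1 == len(s)

print(parse_list("(1)"))     # True
print(parse_list("(1, 2)"))  # False: ')' expected after '1', found ','
print(parse_list("(, 1)"))   # True: only a leading ', ' reaches the recursion
```

As a side effect, the rule as written accepts (, 1), which is probably not intended either.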

In one line:

list = '(' (digit (', ' digit)*)? ')'

It depends on what you want to do with the values and on the PEG implementation, but extracting them might be easier this way.
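
As a rough illustration of why the repetition form makes extraction easy, here is a Python sketch that walks the input with a loop and collects the digits into a flat list. It assumes the digit sequence is optional so that () is still accepted, as the question requires; parse_list and digit_at are hypothetical names, not part of any PEG library.

```python
def parse_list(s):
    """'(' (digit (', ' digit)*)? ')' ; return the digits as ints, or None."""
    def digit_at(pos):
        return pos < len(s) and s[pos] in "0123456789"

    if not s.startswith("("):
        return None
    pos = 1
    values = []
    if digit_at(pos):
        values.append(int(s[pos]))
        pos += 1
        # (', ' digit)* : each iteration consumes a separator plus one digit
        while s.startswith(", ", pos) and digit_at(pos + 2):
            values.append(int(s[pos + 2]))
            pos += 3
    if s.startswith(")", pos) and pos + 1 == len(s):
        return values
    return None

print(parse_list("(1, 4, 3)"))  # [1, 4, 3]
print(parse_list("()"))         # []
print(parse_list("(1, )"))      # None
```

Because the loop appends to one list, the result is already flat. With the recursive rule, the values would typically come back nested and need flattening in a later step.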

