Convert BNF grammar to pyparsing

How can I describe a grammar using regex (or is pyparsing better?) for the script language presented below (Backus–Naur Form):

<root>   :=     <tree> | <leaves>
<tree>   :=     <group> [* <group>] 
<group>  :=     "{" <leaves> "}" | <leaf>;
<leaves> :=     {<leaf>;} <leaf>
<leaf>   :=     <name> = <expression>{;}

<name>          := <string_without_spaces_and_tabs>
<expression>    := <string_without_spaces_and_tabs>

Example of the script:

{
 stage = 3;
 some.param1 = [10, 20];
} *
{
 stage = 4;
 param3 = [100,150,200,250,300]
} *
 endparam = [0, 1]

I use Python's re.compile and want to divide everything into groups, something like this:

[ [ 'stage',       '3'],
  [ 'some.param1', '[10, 20]'] ],

[ ['stage',  '4'],
  ['param3', '[100,150,200,250,300]'] ],

[ ['endparam', '[0, 1]'] ]

Updated: I've found out that pyparsing is a much better solution than regex.

Pyparsing lets you simplify some of these kinds of constructs, such as

leaves :: {leaf} leaf

to just

OneOrMore(leaf)

So one form of your BNF in pyparsing will look something like:

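from pyparsing import (Group, OneOrMore, Suppress, Word, ZeroOrMore,
                       printables, quotedString)

# punctuation is matched but dropped from the results
LBRACE,RBRACE,EQ,SEMI = map(Suppress, "{}=;")
name = Word(printables, excludeChars="{}=;")
# try quotedString first: '|' takes the first alternative that matches,
# and the Word would otherwise consume the opening quote
expr = quotedString | Word(printables, excludeChars="{}=;")

leaf = Group(name + EQ + expr + SEMI)
group = Group(LBRACE + ZeroOrMore(leaf) + RBRACE) | leaf
tree = OneOrMore(group)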

I added quotedString as an alternative expr, in case you wanted to have something that did include one of the excluded chars. And adding Group around leaf and group will maintain the bracing structure.
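
To see what Group contributes, here is a minimal sketch that parses the same text with and without it. The names word, flat_leaf and grouped_leaf are made up for this illustration and are not part of the grammar above; word is deliberately simpler than the real name and expr.

```python
from pyparsing import Group, OneOrMore, Suppress, Word, alphanums

EQ, SEMI = map(Suppress, "=;")
word = Word(alphanums + ".")           # simplified stand-in for name/expr

flat_leaf = word + EQ + word + SEMI    # tokens land in one flat list
grouped_leaf = Group(flat_leaf)        # each leaf becomes its own sublist

text = "a = 1; b = 2;"
flat = OneOrMore(flat_leaf).parseString(text).asList()
nested = OneOrMore(grouped_leaf).parseString(text).asList()

print(flat)    # ['a', '1', 'b', '2']
print(nested)  # [['a', '1'], ['b', '2']]
```

Without Group the name/value pairing has to be reconstructed afterwards; with it, the result already mirrors the structure of the input.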

Unfortunately, your sample doesn't quite conform to this BNF:

  1. spaces in [10, 20] and [0, 1] make them invalid exprs

  2. some leaves do not have a terminating ;

  3. lone * characters - ???

This sample does parse successfully with the above parser:

sample = """
{
 stage = 3;
 some.param1 = [10,20];
}
{
 stage = 4;
 param3 = [100,150,200,250,300];
}
 endparam = [0,1];
 """

parsed = tree.parseString(sample)    
parsed.pprint()

Giving:

[[['stage', '3'], ['some.param1', '[10,20]']],
 [['stage', '4'], ['param3', '[100,150,200,250,300]']],
 ['endparam', '[0,1]']]
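
If you would rather parse the original script unchanged, the three mismatches listed earlier can probably be absorbed by loosening the grammar. The following is only a sketch under my own assumptions: that an expression runs up to the next ;, brace or end of line, that the trailing ; is optional, and that * merely separates groups. STAR, the Regex-based expr and the name original are introduced here for illustration.

```python
from pyparsing import (Group, Optional, Regex, Suppress, Word, ZeroOrMore,
                       printables)

LBRACE, RBRACE, EQ, SEMI, STAR = map(Suppress, "{}=;*")
name = Word(printables, excludeChars="{}=;*")

# assumption: an expression is everything up to ';', a brace or a newline,
# so embedded spaces such as "[10, 20]" are allowed
expr = Regex(r"[^;{}\n]+").setParseAction(lambda t: t[0].strip())

leaf = Group(name + EQ + expr + Optional(SEMI))    # ';' is now optional
# a bare leaf is wrapped in its own group to match the braced form
group = Group(LBRACE + ZeroOrMore(leaf) + RBRACE) | Group(leaf)
tree = group + ZeroOrMore(STAR + group)            # '*' separates groups

original = """
{
 stage = 3;
 some.param1 = [10, 20];
} *
{
 stage = 4;
 param3 = [100,150,200,250,300]
} *
 endparam = [0, 1]
"""

result = tree.parseString(original, parseAll=True).asList()
for grp in result:
    print(grp)
```

Each printed group is a list of [name, expression] pairs, which is the shape asked for in the question. Passing parseAll=True makes the parser raise an exception on any trailing text it cannot match instead of silently stopping early.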
