pyparsing variables that refer to each other
I am using pyparsing to write a small grammar, but I'm running into an issue where the definition of one variable requires another variable that itself requires the first one, and so on. This answer helped me some, since I am working with Prolog strings, but it didn't seem to address the issue I'm having.
Working example:

import pyparsing as pp
# grammar
argument = pp.Word(pp.alphas)
pred = pp.Word(pp.alphas) + (pp.Suppress('(') + pp.delimitedList(argument) + pp.Suppress(')'))
# tests
test_string1="func(x, y)"
test_result1 = pred.parse_string(test_string1)
print(test_result1.dump())

Broken example:

import pyparsing as pp
# grammar
argument = pp.Word(pp.alphas) | pred
pred = pp.Word(pp.alphas) + (pp.Suppress('(') + pp.delimitedList(argument) + pp.Suppress(')'))
# tests
test_string2="func(func(x), y)"
test_result2 = pred.parse_string(test_string2)
print(test_result2.dump())
The idea is that a predication might have as one of its arguments another predication, which itself might have more arguments or predications, and so on. However, the first line of the grammar in the broken example fails because it references pred, which hasn't been assigned yet. I can switch the order, but the problem will persist, because pred requires argument, since a pred can have an argument.
On paper I think it should work, since eventually it would terminate, but because I have to declare things in a certain order, it doesn't seem to work in code form.
The information you provided is insufficient. Please provide a minimal reproducible example.
This is a common issue with grammars, where there is some recursion in the definition of things.
I often encourage parser developers to take a moment and write up a BNF before diving into writing code. This is good practice whether you are using pyparsing or any other parsing library. Here is one that I came up with, based on your code:
BNF:
pred ::= identifier '(' args ')'
args ::= argument [, argument]...
argument ::= identifier | pred
identifier ::= word of one or more alphas
(Jumping ahead, pyparsing will create a railroad diagram of the BNF for you.)
Now you can do a mental walkthrough of your test string using this BNF. Looking at your string "func(func(x), y)", I have some questions about this BNF:
identifier | pred may cause some ambiguity when parsing with pyparsing, since pred also starts with an identifier. We will either need to use pyparsing's '^' operator (which does a longest match), or reorder to match pred | identifier so that the more complex expression is tried before the simpler one.
You rightly determine that there is recursion in this grammar. Pyparsing lets you define an "I need to use this now but I'll define it later" term using the Forward class (from the "forward" declaration in Pascal):
pred = pp.Forward()
Then translating this to pyparsing (working bottom-up) looks like:
identifier = pp.Word(pp.alphas)
argument = pred | identifier
args = pp.delimitedList(argument)
And then to insert the definition into the existing pred term, use the <<= operator instead of =:
pred <<= identifier + pp.Suppress("(") + args + pp.Suppress(")")
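To see why the order of alternatives matters, here is a minimal sketch that builds the same grammar three ways and parses the nested test string with each. The build_pred helper and the variable names are hypothetical, introduced only for this illustration; the pyparsing calls are the ones used above.

```python
import pyparsing as pp


def build_pred(make_argument):
    # Hypothetical helper (not part of pyparsing): builds the pred grammar
    # and lets the caller decide how identifier and pred are combined.
    identifier = pp.Word(pp.alphas)
    pred = pp.Forward()
    argument = make_argument(identifier, pred)
    args = pp.delimitedList(argument)
    pred <<= identifier + pp.Suppress("(") + args + pp.Suppress(")")
    return pred


text = "func(func(x), y)"

# '|' takes the first alternative that matches, so identifier wins too early
simple_first = build_pred(lambda identifier, pred: identifier | pred)
# Trying the more complex expression first avoids that
complex_first = build_pred(lambda identifier, pred: pred | identifier)
# '^' tries every alternative and keeps the longest match
longest = build_pred(lambda identifier, pred: identifier ^ pred)

try:
    simple_first.parse_string(text)
    simple_first_ok = True
except pp.ParseException:
    simple_first_ok = False

complex_first_tokens = complex_first.parse_string(text).as_list()
longest_tokens = longest.parse_string(text).as_list()

print(simple_first_ok)
print(complex_first_tokens)
print(longest_tokens)
```

With identifier first, the inner "func" is consumed as a plain identifier and the parser then expects ',' or ')' where it finds '(', so the parse fails. The other two orderings produce the same flat token list.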
I got so tired of writing test loops for test inputs to parsers that I wrote the run_tests() method, which you would call like this:
pred.run_tests("""\
func(func(x), y)
""")
Which echoes the input string and then dumps the output:
func(func(x), y)
['func', 'func', 'x', 'y']
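run_tests() also returns its results, which can be handy in an automated test. The following is a small sketch, assuming pyparsing 3.x, that suppresses the printed report with print_results=False and inspects the returned values instead. The second test string is made up for illustration.

```python
import pyparsing as pp

identifier = pp.Word(pp.alphas)
pred = pp.Forward()
argument = pred | identifier
args = pp.delimitedList(argument)
pred <<= identifier + pp.Suppress("(") + args + pp.Suppress(")")

# run_tests returns (overall success flag, list of (test string, result) pairs)
success, results = pred.run_tests("""\
    func(func(x), y)
    f(g(h(x)), y, z)
    """, print_results=False)

# Convert each ParseResults to a plain list for easy comparison
parsed = [result.as_list() for _, result in results]
print(success)
print(parsed)
```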
So a couple more notes:
To make the arguments optional, wrap args in pp.Opt().
To preserve structure, wrap logical groups of terms in pp.Group. I would suggest wrapping both pred and the '('+args+')' part of pred:
pred <<= pp.Group(identifier + pp.Group(pp.Suppress("(") + args + pp.Suppress(")")))
Which will then give this run_tests output:
func(func(x), y)
[['func', [['func', ['x']], 'y']]]
[0]:
  ['func', [['func', ['x']], 'y']]
  [0]:
    func
  [1]:
    [['func', ['x']], 'y']
    [0]:
      ['func', ['x']]
      [0]:
        func
      [1]:
        ['x']
    [1]:
      y
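Both notes can be combined in one grammar. This is a minimal sketch, assuming pyparsing 3.x (where pp.Opt is available), that wraps args in pp.Opt() so an empty argument list is accepted, and uses pp.Group as suggested above. The "func()" input is an added illustration.

```python
import pyparsing as pp

identifier = pp.Word(pp.alphas)
pred = pp.Forward()
argument = pred | identifier
args = pp.delimitedList(argument)

# Group the whole pred and its parenthesized arguments; Opt allows "func()"
pred <<= pp.Group(
    identifier + pp.Group(pp.Suppress("(") + pp.Opt(args) + pp.Suppress(")"))
)

nested = pred.parse_string("func(func(x), y)").as_list()
empty = pred.parse_string("func()").as_list()
print(nested)
print(empty)
```

as_list() turns the grouped results into plain nested Python lists, which mirror the nesting of the predications in the input.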