pyparsing variables that refer to each other

I am using pyparsing to write a small grammar, but I'm running into an issue where the definition of one variable requires another variable that itself requires the first one, and so on. This answer helped me some, since I am working with Prolog strings, but it didn't seem to address the issue I'm having.

Working example, without the issue

import pyparsing as pp

# grammar
argument = pp.Word(pp.alphas)
pred = pp.Word(pp.alphas) + (pp.Suppress('(') + pp.delimitedList(argument) + pp.Suppress(')'))


# tests
test_string1="func(x, y)"

test_result1 = pred.parse_string(test_string1)
print(test_result1.dump())

Broken example, with the issue

import pyparsing as pp

# grammar
argument = pp.Word(pp.alphas) | pred
pred = pp.Word(pp.alphas) + (pp.Suppress('(') + pp.delimitedList(argument) + pp.Suppress(')'))


# tests
test_string2="func(func(x), y)"

test_result2 = pred.parse_string(test_string2)
print(test_result2.dump())

The idea is that a predication might have as one of its arguments another predication, which itself might have more arguments or predications, and so on. However, the first line of the grammar in the broken example fails because it references pred, which hasn't been assigned yet. I can switch the order, but the problem persists, because pred in turn needs argument, since a pred can have an argument.

On paper I think it should work, since the recursion eventually terminates, but because I have to declare things in a certain order it doesn't seem to work in code form.

The information you provided is insufficient. Please provide a minimal reproducible example.

This is a common issue with grammars, where there is some recursion in the definition of things.

I often encourage parser developers to take a moment and write up a BNF before diving into writing code. This is good practice whether you are using pyparsing or any other parsing library. Here is one that I came up with, based on your code:

BNF:
    pred ::= identifier '(' args ')'
    args ::= argument [, argument]...
    argument ::= identifier | pred
    identifier ::= word of one or more alphas

(Jumping ahead, pyparsing will create a railroad diagram of the BNF for you.)

Now you can do a mental walkthrough of your test string using this BNF. Looking at your string "func(func(x), y)", I have some questions about this BNF:

  1. Do preds always have arguments? (Should the contained args be optional?)
  2. Defining argument as identifier | pred may cause some ambiguity when parsing with pyparsing, since pred also starts with an identifier. We will either need to use pyparsing's '^' operator (which does a longest match), or reorder to match pred | identifier so that the more complex expression is tried before the simpler one.
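The ordering point in item 2 can be seen without any recursion. The following is only a minimal sketch: `simple_pred` is a hypothetical, non-recursive stand-in for pred (its arguments are plain identifiers), introduced here purely for illustration.

```python
import pyparsing as pp

identifier = pp.Word(pp.alphas)
# Hypothetical non-recursive stand-in for pred: arguments are identifiers only.
simple_pred = identifier + pp.Suppress("(") + pp.delimitedList(identifier) + pp.Suppress(")")

identifier_first = identifier | simple_pred   # '|' stops at the first alternative that matches
pred_first = simple_pred | identifier         # more complex alternative is tried first
longest = identifier ^ simple_pred            # '^' tries every alternative, keeps the longest

text = "func(x)"
first_result = identifier_first.parse_string(text).as_list()
reordered_result = pred_first.parse_string(text).as_list()
longest_result = longest.parse_string(text).as_list()

print(first_result)      # ['func']  (the "(x)" part is never consumed)
print(reordered_result)  # ['func', 'x']
print(longest_result)    # ['func', 'x']
```

With identifier listed first, the bare identifier wins and the parenthesized part is left unparsed, which is why the recursive grammar needs either the reordering or '^'.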

You rightly determine that there is recursion in this grammar. Pyparsing lets you define an "I need to use this now but I'll define it later" term using the Forward class (named after the "forward" declaration in Pascal).

pred = pp.Forward()

Then translating this to pyparsing (working bottom-up) looks like:

identifier = pp.Word(pp.alphas)
argument = pred | identifier
args = pp.delimitedList(argument)

And then to insert the definition into the existing pred term, use the <<= operator instead of = :

pred <<= identifier + pp.Suppress("(") + args + pp.Suppress(")")
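Assembled into one script, the pieces above might look like the sketch below. The `parse_all=True` argument is an addition that was not in the original snippets; it is there only so that leftover input is reported as an error.

```python
import pyparsing as pp

# Placeholder for pred, so that argument can refer to it before it is defined.
pred = pp.Forward()

identifier = pp.Word(pp.alphas)
argument = pred | identifier          # try the more complex alternative first
args = pp.delimitedList(argument)

# '<<=' fills in the existing Forward object; a plain '=' would rebind the
# name pred to a new expression and leave the Forward used inside argument empty.
pred <<= identifier + pp.Suppress("(") + args + pp.Suppress(")")

result = pred.parse_string("func(func(x), y)", parse_all=True)
print(result.as_list())  # ['func', 'func', 'x', 'y']
```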

I got so tired of writing test loops for test inputs to parsers that I wrote the run_tests() method, which you would call like this:

pred.run_tests("""\
    func(func(x), y)
    """)

Which echoes the input string and then dumps the output:

func(func(x), y)
['func', 'func', 'x', 'y']
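run_tests() also returns a (success, results) tuple, so it can be used in scripted checks as well as for printed output. The sketch below rebuilds the grammar so that it is self-contained; the extra test strings (`f(x)`, `func(x`, `func`) are made-up inputs chosen for illustration, not taken from the question.

```python
import pyparsing as pp

pred = pp.Forward()
identifier = pp.Word(pp.alphas)
argument = pred | identifier
pred <<= identifier + pp.Suppress("(") + pp.delimitedList(argument) + pp.Suppress(")")

# Lines starting with '#' are treated as comments; print_results=False keeps it quiet.
success, results = pred.run_tests("""\
    # nested predicate
    func(func(x), y)
    # single argument
    f(x)
    """, print_results=False)

# With failure_tests=True, success means that every input failed to parse.
all_failed, _ = pred.run_tests("""\
    func(x
    func
    """, failure_tests=True, print_results=False)

print(success, all_failed)
for test_string, parsed in results:
    print(test_string, "->", parsed.as_list())
```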

So a couple more notes:

  1. To make the arguments optional, wrap them in pp.Opt().

  2. To preserve structure, wrap logical groups of terms in pp.Group. I would suggest wrapping both pred and the '('+args+')' part of pred:

     pred <<= pp.Group(identifier + pp.Group(pp.Suppress("(") + args + pp.Suppress(")")))

Which will then give this run_tests output:

func(func(x), y)
[['func', [['func', ['x']], 'y']]]
[0]:
  ['func', [['func', ['x']], 'y']]
  [0]:
    func
  [1]:
    [['func', ['x']], 'y']
    [0]:
      ['func', ['x']]
      [0]:
        func
      [1]:
        ['x']
    [1]:
      y
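The two notes can also be combined. This is a sketch of one way to do it, not the only one; placing pp.Opt around args inside the inner Group is an assumption about where the optional part belongs, and the test strings with empty parentheses are made up for illustration.

```python
import pyparsing as pp

identifier = pp.Word(pp.alphas)
pred = pp.Forward()
argument = pred | identifier
args = pp.delimitedList(argument)

# Opt lets the parentheses be empty; the Groups keep the nesting in the results.
pred <<= pp.Group(
    identifier + pp.Group(pp.Suppress("(") + pp.Opt(args) + pp.Suppress(")"))
)

no_args = pred.parse_string("func()", parse_all=True).as_list()
nested = pred.parse_string("func(func(), y)", parse_all=True).as_list()

print(no_args)  # [['func', []]]
print(nested)   # [['func', [['func', []], 'y']]]
```

The empty inner list marks a pred that was called with no arguments, which a flat, ungrouped result could not distinguish from a bare identifier.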
