
Generate string as per grammar in pyparsing

I have a grammar defined in pyparsing:

from pyparsing import (CaselessKeyword, Group, Keyword, OneOrMore, Optional,
                       Suppress, Word, printables)

OCB, CCB, SQ = map(Suppress, "{}'")
name = SQ + Word(printables, excludeChars="{},'") + SQ
_name = CaselessKeyword("name").suppress()
_interface = Keyword("Component").suppress()
interface = Group(_interface + OCB + _name + name("interface_name") + CCB)
system = OneOrMore(interface + Optional(",").suppress())("interfaces")

If I have an input string:

model = ("Component { name '/comp1' }, "
         "Component { name '/comp2' }")
result = system.parseString(model)
print(result.dump())

The parsing result is as expected:

[['/comp1'], ['/comp2']]
- interfaces: [['/comp1'], ['/comp2']]
  [0]:
    ['/comp1']
    - interface_name: ['/comp1']
  [1]:
    ['/comp2']
    - interface_name: ['/comp2']

I want to know if there is a way to generate a string based on the grammar above. Since the only "variables" are comp1 and comp2, I need a function that generates the text:

def generate_string(comps: list):
    # do something
    return result

The result of generate_string (maybe using originalTextFor?) should be:

"Component { name '/comp1' }, Component { name '/comp2' }"

I have seen examples that edit ParseResults and that manually insert into ParseResults, but they don't make use of the grammar.

As I understand the question, you want to produce strings from the parsing results. If that is what you want, I have an interesting solution.

from pyparsing import *

def to_string_parse_action(to_string_function):
  def parse_action(s, loc, toks):
    return toks, lambda : to_string_function(toks)
  return parse_action

OCB, CCB, SQ = map(Suppress, "{}'")
name = SQ + Word(printables, excludeChars="{},'") + SQ
_name = CaselessKeyword("name").suppress()
_interface = Keyword("Component").suppress()
interface = Group(_interface + OCB + _name + name + CCB)
system = OneOrMore(interface + Optional(",").suppress())("interfaces")

def name_to_string(toks):
  return "name '" + toks[0] + "'"

def interface_to_string(toks):
  result = "Component { " + toks[0][0][1]() + " }"
  return result
  
def system_to_string(toks):
  result = ""
  first=True
  for interface, interface_to_string_function in toks:
    if not first:
      result = result + ", "
    first = False
    result = result + interface_to_string_function()
  return result

name.setParseAction(to_string_parse_action(name_to_string))
interface.setParseAction(to_string_parse_action(interface_to_string))
system.setParseAction(to_string_parse_action(system_to_string))

model = "Component { name '/comp1' }, \
         Component { name '/comp2' }"
         
results = system.parseString(model)

for result, result_to_string in results:
  print(result_to_string())

I cannot imagine what this could be good for, but it was an interesting exercise for me, and it works. If you parse something and then call result_to_string, the string is reproduced.

The idea is to pass the functions that create the strings through the whole parsing process. At the end of parsing you have one to_string function for the parsing result. You call it, and it recursively calls the embedded to_string functions.

A parse action normally serves to create the tokens. In my version it creates the tokens together with the function that produces a string.

You only have to write a to_string function and call to_string_parse_action when setting the parse action. The rest is nearly unchanged from your code.

Strangely, it does not work with results names, so I have removed them from your example. It should be possible to make this mechanism work with results names as well, but I found no solution for this, sorry.

One other drawback is that the parsing results become more complicated, because they now contain all these functions. I assume you want to do all this because you want to create parsing results artificially and then let them generate strings. If so, you must now also supply the to_string functions, which takes a bit more effort. But there should be plenty of alternative ways to do it, e.g. by using classes with appropriate constructors or by using a naming convention. I only illustrate the idea here.
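The class-based alternative mentioned above could be sketched roughly as follows. This is only an illustration, not part of the original answer: the Component class, its from_tokens constructor, and the body of generate_string are hypothetical names and choices, and a plain list stands in for the tokens that a parse action on interface would receive. The output format is hard-coded to match the grammar rather than derived from it.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Component:
    """Hypothetical value class for one parsed 'Component { name '...' }'."""
    name: str

    @classmethod
    def from_tokens(cls, toks: Sequence[str]) -> "Component":
        # Assumption: toks holds the tokens of one interface group,
        # e.g. ['/comp1'], as in the dump shown in the question.
        return cls(toks[0])

    def __str__(self) -> str:
        # Mirrors the literal text the grammar accepts for one interface.
        return "Component { name '" + self.name + "' }"


def generate_string(comps: List[str]) -> str:
    # Build one Component per name and join them with the ", " separator.
    return ", ".join(str(Component(c)) for c in comps)


text = generate_string(["/comp1", "/comp2"])
print(text)
```

With this design the parsing results stay free of functions: each object knows how to render itself, and artificial results are created simply by calling the constructor.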

If you want, you can remove the lambda and the toks argument in to_string_parse_action. Then you have to call that function explicitly with toks everywhere. For example, the very last line would then be:

print(result_to_string(result))
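For completeness, the lambda-free variant described above might look like the sketch below. It is my reading of that description, not code from the answer: each parse action returns the tokens paired with the bare to_string function, and every caller passes the tokens in explicitly. The names name_fn and text are introduced here for illustration, and results names are left out entirely.

```python
from pyparsing import (CaselessKeyword, Group, Keyword, OneOrMore, Optional,
                       Suppress, Word, printables)


def to_string_parse_action(to_string_function):
  # Pair the tokens with the bare function; callers supply toks later.
  def parse_action(s, loc, toks):
    return toks, to_string_function
  return parse_action


OCB, CCB, SQ = map(Suppress, "{}'")
name = SQ + Word(printables, excludeChars="{},'") + SQ
_name = CaselessKeyword("name").suppress()
_interface = Keyword("Component").suppress()
interface = Group(_interface + OCB + _name + name + CCB)
system = OneOrMore(interface + Optional(",").suppress())


def name_to_string(toks):
  return "name '" + toks[0] + "'"


def interface_to_string(toks):
  # The group holds one (tokens, function) pair produced by name's action.
  name_toks, name_fn = toks[0][0]
  return "Component { " + name_fn(name_toks) + " }"


def system_to_string(toks):
  # Each item is an (interface tokens, interface_to_string) pair.
  return ", ".join(fn(interface_toks) for interface_toks, fn in toks)


name.setParseAction(to_string_parse_action(name_to_string))
interface.setParseAction(to_string_parse_action(interface_to_string))
system.setParseAction(to_string_parse_action(system_to_string))

model = "Component { name '/comp1' }, \
         Component { name '/comp2' }"

results = system.parseString(model)

for result, result_to_string in results:
  text = result_to_string(result)
  print(text)
```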
