
A better way than looping and calling functions that loop and call other functions

I have a message (a string) that is composed of transactions, each of which is composed of groups, each of which is composed of elements.

Is there a better way to parse such a message than looping and calling functions that loop and call other functions, which in turn loop and call yet more functions? I find the following silly:

import re

# trans_pattern, groups_pattern and element_pattern are regexes defined elsewhere.
class Parser:
  def parse_msg(self, msg):
    trans = re.findall(trans_pattern, msg)
    for t in trans:
      self.parse_trans(t)

  def parse_trans(self, trans):
    groups = re.findall(groups_pattern, trans)
    for g in groups:
      self.parse_group(g)

  def parse_group(self, group):
    elements = re.findall(element_pattern, group)
    for e in elements:
      self.parse_element(e)

  def parse_element(self, e):
    pass

Is there a better way or design pattern that I can use to approach this?

Well, I guess there are several possibilities. You could use a structure like the following:

import re

GRAMMAR = (
    trans_pattern, (
        groups_pattern, (
            element_pattern, None
        )
    )
)

def parse_message(msg):
    parse_message_rec(msg, GRAMMAR)

def parse_message_rec(msg, grammar):
    if grammar is None:
        # Leaf element
        return
    pattern, next_grammar = grammar
    children = re.findall(pattern, msg)
    for child in children:
        parse_message_rec(child, next_grammar)
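
To make the recursion above concrete, here is a minimal sketch that returns the parsed pieces as nested lists instead of discarding them. The message format and the three patterns are assumptions made up for illustration (the question does not show its real patterns): `{...}` wraps a transaction, `[...]` wraps a group, and elements are separated by commas.

```python
import re

# Hypothetical format, for illustration only: "{[a,b][c]}{[d]}"
GRAMMAR = (
    r"\{(.*?)\}", (          # one match per transaction
        r"\[(.*?)\]", (      # one match per group
            r"[^,]+", None   # one match per element
        )
    )
)

def parse_rec(text, grammar):
    """Return nested lists that mirror the nesting of the grammar."""
    if grammar is None:
        return text  # leaf: keep the element's raw text
    pattern, next_grammar = grammar
    return [parse_rec(child, next_grammar)
            for child in re.findall(pattern, text)]

result = parse_rec("{[a,b][c]}{[d]}", GRAMMAR)
print(result)
```

Adding a fourth nesting level would then only mean adding another `(pattern, ...)` pair to `GRAMMAR`, with no new function.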

Nesting re.findall calls like that sounds expensive, since every piece of text gets scanned multiple times: once per nesting level, so roughly three passes over the input here, and possibly more if the patterns backtrack heavily.

Instead, I would write a function that goes through the input once and parses everything in a single pass. There appears to be a handy pyparsing module you can use for this (I've never used it myself, so I can't speak to its learning curve, difficulty, or performance).

Otherwise, to do it manually, you'd have to keep track of your current "depth" (trans, group, or element) and determine whether you're opening or closing a trans/group/element at that depth, while collecting the data between the opening and closing expressions. In short, don't find all the "trans" up front. Just find where the first one begins, grab any data that belongs to it until the first group begins, start the new group, grab its data until an element begins, start the new element, grab data until it closes, then check whether another element follows or the group closes, and so on.

It's not as concise, but it should be faster. If speed is not a concern, your method is fine. If it is (or will be) a concern, then you'll want to parse the message in one pass.
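
As a rough illustration of the manual single-pass idea, the sketch below walks the string once and tracks which transaction and group are currently open. The message format is hypothetical (the question does not show the real one): `{...}` wraps a transaction, `[...]` wraps a group, and elements are separated by commas. It also assumes well-formed input and does no error handling.

```python
def parse_single_pass(msg):
    """Parse a hypothetical '{[a,b][c]}{[d]}' message in one left-to-right scan."""
    transactions = []
    trans = None   # groups of the currently open transaction
    group = None   # elements of the currently open group
    buf = []       # characters of the element being read

    for ch in msg:
        if ch == "{":            # open a transaction
            trans = []
        elif ch == "}":          # close it and store it
            transactions.append(trans)
            trans = None
        elif ch == "[":          # open a group
            group = []
        elif ch in ",]":         # an element ends at ',' or ']'
            group.append("".join(buf))
            buf = []
            if ch == "]":        # ']' also closes the group
                trans.append(group)
                group = None
        else:                    # ordinary character inside an element
            buf.append(ch)
    return transactions

parsed = parse_single_pass("{[a,b][c]}{[d]}")
print(parsed)
```

Each character is looked at exactly once, which is the point of the approach. A real format with multi-character delimiters would need a tokenizer in front of this loop, but the open/close bookkeeping would stay the same.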

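I'd suggest the following approach: convert your special format into simple XML (using regular expressions or whatever you prefer), and then apply any XML pattern, method, or library to parse the text.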
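
A minimal sketch of that idea, assuming a made-up format in which `{...}` wraps a transaction, `[...]` wraps a group, and elements are separated by commas. The tag names `msg`, `trans`, `group`, and `element` are likewise invented for illustration. Once the text is XML, the standard library's `xml.etree.ElementTree` can walk it.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def to_xml(msg):
    """Rewrite a hypothetical '{[a,b][c]}{[d]}' message as XML text."""
    text = escape(msg)  # protect any &, <, > inside element data
    # Wrap every run of non-delimiter characters in an <element> tag.
    text = re.sub(r"[^{}\[\],]+",
                  lambda m: f"<element>{m.group(0)}</element>", text)
    text = text.replace(",", "")  # separators are no longer needed
    text = text.replace("{", "<trans>").replace("}", "</trans>")
    text = text.replace("[", "<group>").replace("]", "</group>")
    return f"<msg>{text}</msg>"

root = ET.fromstring(to_xml("{[a,b][c]}{[d]}"))

# Walk the tree with ordinary ElementTree calls.
nested = [[[e.text for e in g.findall("element")]
           for g in t.findall("group")]
          for t in root.findall("trans")]
print(nested)
```

The conversion still scans the text a few times, so this trades speed for the convenience of existing XML tooling.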

