Split a Boolean expression into all possibilities with PyParsing

I am creating a program that filters an Excel document. In some cells, there are combinations of codes separated by Boolean operators. I need to split these into every possibility, and I am looking at achieving this via PyParsing.

For example, if I have the following:

single = "347SJ"
single_or = "456NG | 347SJ"
and_or = "347SJ & (347SJ | 383DF)"
and_multi_or = "373FU & (383VF | 321AC | 383ZX | 842UQ)"

I want to end up with:

single = "347SJ"
single_or1 = "456NG"
single_or2 = "347SJ"
and_or1 = "347SJ & 347SJ"
and_or2 = "347SJ & 383DF"
and_multi_or1 = "373FU & 383VF"
and_multi_or2 = "373FU & 321AC"
and_multi_or3 = "373FU & 383ZX"
and_multi_or4 = "373FU & 842UQ"

I feel like this should be simple, but I can't find anything similar online. Can anyone help?

Here is a start: a parser that processes those inputs as & and | operations (& is listed first, so it binds more tightly than |):

import pyparsing as pp

operand = pp.Word(pp.alphanums)

bool_expr = pp.infix_notation(operand, [
    ('&', 2, pp.opAssoc.LEFT),
    ('|', 2, pp.opAssoc.LEFT),
])

tests = [single, single_or, and_or, and_multi_or]

bool_expr.run_tests(tests)

prints

347SJ    
['347SJ']

456NG | 347SJ
[['456NG', '|', '347SJ']]
[0]:
  ['456NG', '|', '347SJ']

347SJ & (347SJ | 383DF)
[['347SJ', '&', ['347SJ', '|', '383DF']]]
[0]:
  ['347SJ', '&', ['347SJ', '|', '383DF']]
  [2]:
    ['347SJ', '|', '383DF']

373FU & (383VF | 321AC | 383ZX | 842UQ)
[['373FU', '&', ['383VF', '|', '321AC', '|', '383ZX', '|', '842UQ']]]
[0]:
  ['373FU', '&', ['383VF', '|', '321AC', '|', '383ZX', '|', '842UQ']]
  [2]:
    ['383VF', '|', '321AC', '|', '383ZX', '|', '842UQ']

So that was the simple part. The next step is to traverse that structure and generate the various paths through the logical operators. This part gets kind of hairy.

PART 2 -- STOP HERE IF YOU WANT TO SOLVE THE REST YOURSELF!!!

(Much of this code is taken from the invRegex example, which inverts a regex by generating sample strings that match it.)

Rather than extract the structure and then walk it, we can use pyparsing parse actions to add active nodes at each level. Each node then gives us a generator that produces its part of the expression.

class GroupEmitter:
    def __init__(self, exprs):
        # drop the operators, take the operands
        self.exprs = exprs[0][::2]

    def makeGenerator(self):
        def groupGen():
            def recurseList(elist):
                if len(elist) == 1:
                    if isinstance(elist[0], pp.ParseResults):
                        yield from recurseList(elist[0])
                    else:
                        yield from elist[0].makeGenerator()()
                else:
                    for s in elist[0].makeGenerator()():
                        for s2 in recurseList(elist[1:]):
                            # every emitter here yields a list of codes,
                            # so combining two partial results is just
                            # list concatenation
                            yield [*s, *s2]

            if self.exprs:
                yield from recurseList(self.exprs)

        return groupGen


class AlternativeEmitter:
    def __init__(self, exprs):
        # drop the operators, take the operands
        self.exprs = exprs[0][::2]

    def makeGenerator(self):
        def altGen():
            for e in self.exprs:
                yield from e.makeGenerator()()

        return altGen


class LiteralEmitter:
    def __init__(self, lit):
        self.lit = lit[0]

    def __str__(self):
        return "Lit:" + self.lit

    def __repr__(self):
        return "Lit:" + self.lit

    def makeGenerator(self):
        def litGen():
            yield [self.lit]

        return litGen


operand.add_parse_action(LiteralEmitter)
bool_expr = pp.infix_notation(operand, [
    ('&', 2, pp.opAssoc.LEFT, GroupEmitter),
    ('|', 2, pp.opAssoc.LEFT, AlternativeEmitter),
])


def invert(expr):
    return bool_expr.parse_string(expr)[0].makeGenerator()()


for t in tests:
    print(t)
    print(list(invert(t)))
    print("\n".join(' & '.join(tt) for tt in invert(t)))
    print()

prints

347SJ
[['347SJ']]
347SJ

456NG | 347SJ
[['456NG'], ['347SJ']]
456NG
347SJ

347SJ & (347SJ | 383DF)
[['347SJ', '347SJ'], ['347SJ', '383DF']]
347SJ & 347SJ
347SJ & 383DF

373FU & (383VF | 321AC | 383ZX | 842UQ)
[['373FU', '383VF'], ['373FU', '321AC'], ['373FU', '383ZX'], ['373FU', '842UQ']]
373FU & 383VF
373FU & 321AC
373FU & 383ZX
373FU & 842UQ
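
For comparison, the "extract the structure and then walk it" route mentioned earlier can be sketched without parse actions. This is only a rough sketch, not the approach used above: `expand` is a hypothetical helper name, and it assumes its input is the plain nested list printed by the first parser (for example, something like `bool_expr.parse_string(text).as_list()[0]` taken from the version of `bool_expr` without parse actions), where each group contains only one kind of operator.

```python
from itertools import product


def expand(node):
    """Return every AND-combination for a parsed node, as a list of code lists.

    `node` is either a single code (str) or a nested list such as
    ['347SJ', '&', ['347SJ', '|', '383DF']]. Hypothetical helper, not part
    of pyparsing.
    """
    if isinstance(node, str):
        return [[node]]
    if len(node) == 1:
        return expand(node[0])

    # operands sit at even positions, the (single) operator at odd ones
    operator, operands = node[1], node[::2]

    if operator == '|':
        # alternatives: each operand contributes its own combinations
        return [combo for child in operands for combo in expand(child)]
    if operator == '&':
        # conjunction: pick one combination per operand and concatenate them
        return [
            [code for part in parts for code in part]
            for parts in product(*(expand(child) for child in operands))
        ]
    raise ValueError(f"unexpected operator: {operator!r}")


parsed = ['373FU', '&', ['383VF', '|', '321AC', '|', '383ZX', '|', '842UQ']]
for combo in expand(parsed):
    print(' & '.join(combo))
```

With the nested list for and_multi_or this prints the same four lines as the generator version. The trade-off is that it builds every combination in memory at once, whereas the emitter classes produce them lazily.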
