Python's pyparsing: Implementing grammar to parse logical AND expression

I am trying to parse and evaluate expressions, read from an input file, of the form:

var[3] = 0 and var[2] = 1
var[0] = 1 and var[2] = 0 and var[3] = 1
...

(I actually also allow "multibit access", i.e. var[X:Y], but let's ignore that for now.)
Here var is an integer, and [] indicates bit access.
For example, for var = 0x9, the first expression above should evaluate to False and the second should evaluate to True, since 0x9 = 0b1001.
The only binary operators I allow are "and" and "="; for the = operator, the left operand is always var[X] and the right operand is always a number.
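
To make the intended semantics concrete, here is a minimal plain-Python sketch (no parsing yet) of what the two sample expressions should evaluate to. The helper name bit is hypothetical and not part of the question's code; it assumes bit 0 is the least significant bit.

```python
def bit(value, offset):
    """Return bit number `offset` of `value`, counting from the least significant bit."""
    return (value >> offset) & 0x1

var = 0x9  # 0b1001

# var[3] = 0 and var[2] = 1
first = bit(var, 3) == 0 and bit(var, 2) == 1
# var[0] = 1 and var[2] = 0 and var[3] = 1
second = bit(var, 0) == 1 and bit(var, 2) == 0 and bit(var, 3) == 1

print(first, second)  # False True
```
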
I looked around a bit and found that this could be achieved with Python's pyparsing, but I ran into some difficulties trying to implement it.
Here's what I've tried so far, based roughly on this example (which is one of many examples provided here):

#!/usr/bin/env python
from pyparsing import Word, alphas, nums, infixNotation, opAssoc

class BoolAnd():
    def __init__(self, pattern):
        self.args = pattern[0][0::2]

    def __bool__(self):
        return all(bool(a) for a in self.args)

    __nonzero__ = __bool__


class BoolEqual():
    def __init__(self, pattern):
        self.bit_offset = int(pattern[0][1])
        self.value = int(pattern[0][-1])

    def __bool__(self):
        return True if (0xf >> self.bit_offset) & 0x1 == self.value else False # for now, let's assume var == 0xf

    __nonzero__ = __bool__




variable_name   = 'var'
bit_access      = variable_name + '[' + Word(nums) + ']'
multibit_access = variable_name + '[' + Word(nums) + ':' + Word(nums) + ']'
value = Word(nums)

operand = bit_access | multibit_access | value

expression = infixNotation(operand,
    [
    ('=',   2, opAssoc.LEFT,  BoolEqual),
    ('AND', 2, opAssoc.LEFT,  BoolAnd),
    ])


p = expression.parseString('var[3] = 1 AND var[1] = 0', True)

print 'SUCCESS' if bool(p) else 'FAIL'

I have three problems I need help with.

  1. For multibit access of the form var[X:Y] = Z, how do I enforce that:
    a. X > Y
    b. Z < 2^{X - Y + 1}
    I assume this can't be enforced by the grammar itself (whereas for single-bit access of the form var[X] = Y, I can enforce in the grammar that Y is either 0 or 1, which causes expression.parseString() to fail with an exception if Y is anything else).
  2. Most importantly: why does it always print SUCCESS? What am I doing wrong?
    For the input var[3] = 1 AND var[1] = 0 it should print FAIL (you can see in my example that I hardcoded var to be 0xf, so var[3] = 1 is True but var[1] = 0 is False).
  3. Which brings me to my third problem: var is not a class member of BoolEqual, nor is it global. Is there a way to somehow pass it to BoolEqual's __init__ function?

Before getting to the problem solving, I suggest some minor changes to your grammar, primarily the inclusion of results names. Adding these names will make your resulting code much cleaner and more robust. I'm also using some of the expressions that were added in recent pyparsing versions, in the pyparsing_common namespace class:

from pyparsing import pyparsing_common

variable_name   = pyparsing_common.identifier.copy()
integer = pyparsing_common.integer.copy()
bit_access      = variable_name('name') + '[' + integer('bit') + ']'
multibit_access = variable_name('name') + '[' + integer('start_bit') + ':' + integer('end_bit') + ']'
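
As a quick check of what the results names provide, the following sketch parses one single-bit and one multibit access and reads the fields back by name. It assumes a pyparsing version that ships pyparsing_common; the sample inputs are made up for illustration.

```python
from pyparsing import pyparsing_common

variable_name = pyparsing_common.identifier.copy()
integer = pyparsing_common.integer.copy()  # converts matched digits to int
bit_access = variable_name('name') + '[' + integer('bit') + ']'
multibit_access = (variable_name('name') + '[' + integer('start_bit')
                   + ':' + integer('end_bit') + ']')

single = bit_access.parseString("var[3]")
multi = multibit_access.parseString("var[7:4]")

# Fields are available as attributes, with no positional indexing needed.
print(single.name, single.bit)
print(multi.name, multi.start_bit, multi.end_bit)
```

Because pyparsing_common.integer carries its own conversion, the named values arrive as Python ints rather than strings.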

Part 1a: Enforcing valid values in "var[X:Y]"

This kind of work is best done using parse actions and conditions. Parse actions are parse-time callbacks that you can attach to your pyparsing expressions to modify, enhance, or filter the results, or to raise an exception if a validation rule fails. They are attached using the method:

expr.addParseAction(parse_action_fn)

And parse_action_fn can have any of the following signatures:

def parse_action_fn(parse_string, parse_location, matched_tokens):
def parse_action_fn(parse_location, matched_tokens):
def parse_action_fn(matched_tokens):
def parse_action_fn():

(See more at https://pythonhosted.org/pyparsing/pyparsing.ParserElement-class.html#addParseAction)

Parse actions can return None, return new tokens, modify the given tokens, or raise an exception.
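
A small sketch of two of these behaviors, using a one-argument action that returns a new token and a three-argument action that raises. The names to_int, reject_large, and byte_value are invented for this illustration and are not part of the grammar above.

```python
from pyparsing import Word, nums, ParseException

def to_int(tokens):
    # One-argument form: replace the matched string with an int.
    return int(tokens[0])

def reject_large(parse_string, parse_location, tokens):
    # Three-argument form: validate, and raise if the rule fails.
    # Returning None leaves the tokens unchanged.
    if tokens[0] > 255:
        raise ParseException(parse_string, parse_location,
                             "value must fit in one byte")

# Actions run in the order they are attached, so reject_large sees an int.
byte_value = Word(nums).addParseAction(to_int, reject_large)

print(byte_value.parseString("42")[0])

try:
    byte_value.parseString("300")
except ParseException as err:
    print("rejected:", err)
```
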

If all the parse action does is evaluate some condition based on the input tokens, you can write it as a simple function that returns True or False, and pyparsing will raise an exception if False is returned. In your case, your first validation rule could be implemented as:

def validate_multibit(tokens):
    return tokens.end_bit > tokens.start_bit
multibit_access.addCondition(validate_multibit,
                            message="start bit must be less than end bit", 
                            fatal=True)

Or even just as a Python lambda function:

multibit_access.addCondition(lambda t: t.end_bit > t.start_bit, 
                            message="start bit must be less than end bit", 
                            fatal=True)

Now you can try this with:

multibit_access.parseString("var[3:0]")

And you will get this exception:

pyparsing.ParseFatalException: start bit must be less than end bit (at char 0), (line:1, col:1)
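
Note that the condition above accepts ranges written low bit first, whereas the question asks for X > Y in var[X:Y]. Under that high-bit-first convention the comparison would simply be reversed. A minimal sketch, assuming that convention (the name msb_first_access and the message text are made up for illustration):

```python
from pyparsing import ParseFatalException, pyparsing_common

variable_name = pyparsing_common.identifier.copy()
integer = pyparsing_common.integer.copy()
msb_first_access = (variable_name('name') + '[' + integer('start_bit')
                    + ':' + integer('end_bit') + ']')

# Assumption: var[X:Y] is valid only when X > Y, as stated in the question.
msb_first_access.addCondition(lambda t: t.start_bit > t.end_bit,
                              message="start bit must be greater than end bit",
                              fatal=True)

ok = msb_first_access.parseString("var[3:0]")
print(ok.start_bit, ok.end_bit)

try:
    msb_first_access.parseString("var[0:3]")
except ParseFatalException as err:
    print("rejected:", err)
```

If this convention is used, the width in the value check of Part 1b becomes start_bit - end_bit + 1 instead.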

Part 1b: Enforcing valid values in "var[X:Y] = Z"

Your second validation rule deals not only with the var bit ranges but also with the value it is being compared to. This requires a parse action attached to the complete BoolEqual. We could put this in BoolEqual's __init__ method, but I prefer to keep independent functions separate when possible. And since we will be adding our validation at the infixNotation level, and infixNotation only accepts parse actions, we will need to write your second validation rule as a parse action that raises an exception. (We will also use a feature that was only recently released, in pyparsing 2.2.0: attaching multiple parse actions at one level in infixNotation.)

Here is the validation that we wish to perform:

  • if single-bit expression, value must be 0 or 1
  • if multibit expression var[X:Y], value must be < 2**(Y-X+1)

def validate_equality_args(tokens):
    tokens = tokens[0]
    z = tokens[-1]
    if 'bit' in tokens:
        if z not in (0,1):
            raise ParseFatalException("invalid equality value - must be 0 or 1")
    else:
        x = tokens.start_bit
        y = tokens.end_bit
        if not z < 2**(y - x + 1):
            raise ParseFatalException("invalid equality value")

And we attach this parse action to infixNotation using:

expression = infixNotation(operand,
    [
    ('=',   2, opAssoc.LEFT,  (validate_equality_args, BoolEqual)),
    ('AND', 2, opAssoc.LEFT,  BoolAnd),
    ])
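
The question's second problem (why SUCCESS is always printed) is not addressed explicitly above. One likely cause, offered here as an assumption rather than a confirmed diagnosis: parseString returns a ParseResults container, which is truthy whenever it holds any tokens, so bool(p) never reaches the __bool__ methods. The evaluable object is p[0]. The sketch below uses simplified stand-ins for BoolEqual and BoolAnd, a single-bit-only validator (validate_single_bit is a hypothetical name), and var hardcoded to 0xf as in the question.

```python
from pyparsing import (CaselessKeyword, ParseFatalException, infixNotation,
                       opAssoc, pyparsing_common)

VAR_VALUE = 0xf  # hardcoded, as in the question

class BoolEqual:
    def __init__(self, tokens):
        group = tokens[0]
        self.bit_offset = group.bit   # results name set in bit_access
        self.value = group[-1]        # right-hand operand, already an int

    def __bool__(self):
        return (VAR_VALUE >> self.bit_offset) & 0x1 == self.value

class BoolAnd:
    def __init__(self, tokens):
        self.args = tokens[0][0::2]   # skip the 'and' keywords

    def __bool__(self):
        return all(bool(a) for a in self.args)

def validate_single_bit(tokens):
    # Simplified validator: only handles single-bit comparisons.
    group = tokens[0]
    if 'bit' in group and group[-1] not in (0, 1):
        raise ParseFatalException("invalid equality value - must be 0 or 1")

integer = pyparsing_common.integer.copy()
bit_access = pyparsing_common.identifier('name') + '[' + integer('bit') + ']'
operand = bit_access | integer

expression = infixNotation(operand, [
    ('=', 2, opAssoc.LEFT, (validate_single_bit, BoolEqual)),
    (CaselessKeyword('and'), 2, opAssoc.LEFT, BoolAnd),
])

p = expression.parseString('var[3] = 1 AND var[1] = 0', parseAll=True)
print(bool(p))     # truthiness of the non-empty container
print(bool(p[0]))  # truthiness of the parsed expression itself
```

With var fixed at 0xf, var[1] = 0 does not hold, so the second print is the one that reflects the expression's value.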

Part 3: Supporting var names and values other than 0xf

To deal with vars of various names, you can add a class-level dict to BoolEqual:

class BoolEqual():
    var_names = {}

and set this ahead of time:

BoolEqual.var_names['var'] = 0xf

And then implement your __bool__ method as just:

return (self.var_names[self.var_name] >> self.bit_offset) & 0x1 == self.value

(This will need to be extended to support multibit, but the general idea is the same.)
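
One way the multibit extension could look, sketched in plain Python. The helper name extract_bits and its low-bit-first argument order are assumptions for illustration and not part of the code above.

```python
def extract_bits(value, low_bit, high_bit):
    """Return bits low_bit..high_bit (inclusive) of value as an integer."""
    width = high_bit - low_bit + 1
    mask = (1 << width) - 1          # e.g. width 2 -> 0b11
    return (value >> low_bit) & mask

var_names = {'var': 0x9}  # 0b1001

print(extract_bits(var_names['var'], 0, 3))  # whole nibble: 9
print(extract_bits(var_names['var'], 2, 3))  # top two bits: 2
print(extract_bits(var_names['var'], 3, 3))  # single bit: 1
```

A multibit __bool__ could then compare the result of such a helper against the parsed right-hand value, in the same way the single-bit version compares one shifted and masked bit.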

How about converting the variable to a list of 1s and 0s and using eval to evaluate the boolean expressions (with a small modification, changing = into ==):

def parse(lines, v):
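    # Build a real list (map() is lazy in Python 3), least significant bit first so var[0] is bit 0
    var = [int(b) for b in reversed(bin(v)[2:])]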
    result = []

    for l in lines:
        l = l.replace('=','==')
        result.append(eval(l))

    return result

inp = \
"""
var[3] = 0 and var[2] = 1
var[0] = 1 and var[2] = 0 and var[3] = 1
"""

lines = inp.split('\n')[1:-1]
v = 0x09

print(parse(lines, v))

Output:

[False, True]

Note that you should only use eval if you trust the input.
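
If the input cannot be trusted, the same single-bit expressions can be evaluated without eval. The following is a minimal sketch using only the standard library; the function name evaluate and the exact pattern are assumptions for illustration, and it deliberately supports nothing beyond var[N] = 0/1 clauses joined by "and".

```python
import re

# One clause: var[<digits>] = 0 or 1
CLAUSE = re.compile(r'^var\[(\d+)\]\s*=\s*([01])$')

def evaluate(line, v):
    """Evaluate 'var[i] = b and var[j] = c ...' against integer v without eval."""
    results = []
    for clause in re.split(r'\s+and\s+', line.strip()):
        match = CLAUSE.match(clause.strip())
        if match is None:
            raise ValueError("unsupported clause: %r" % clause)
        offset, expected = int(match.group(1)), int(match.group(2))
        results.append((v >> offset) & 0x1 == expected)
    return all(results)

lines = [
    "var[3] = 0 and var[2] = 1",
    "var[0] = 1 and var[2] = 0 and var[3] = 1",
]
print([evaluate(line, 0x09) for line in lines])  # [False, True]
```

Anything that does not match the clause pattern is rejected with a ValueError instead of being executed.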
