Python's pyparsing: Implementing grammar to parse logical AND expression
I am trying to parse and evaluate expressions, given to me as input from a file, of the form:
var[3] = 0 and var[2] = 1
var[0] = 1 and var[2] = 0 and var[3] = 1
...
(Actually, I also allow "multibit access", i.e. var[X:Y], but let's ignore it for now...)

Where var is an integer, and the [] indicates bit access.
For example, for var = 0x9, the first expression above should be evaluated to False, and the second should be evaluated to True, since 0x9 = b1001.
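To make the intended semantics concrete, here is a minimal plain-Python sketch of the arithmetic behind the two example lines. The helper name `bit` is hypothetical and exists only for this illustration; bit 0 is assumed to be the least significant bit.

```python
def bit(value, index):
    """Return bit `index` of `value`, counting from the least significant bit."""
    return (value >> index) & 1

var = 0x9  # binary 1001

# var[3] = 0 and var[2] = 1
first = bit(var, 3) == 0 and bit(var, 2) == 1
# var[0] = 1 and var[2] = 0 and var[3] = 1
second = bit(var, 0) == 1 and bit(var, 2) == 0 and bit(var, 3) == 1

print(first, second)  # False True
```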
and and = are the only binary operators I allow, and for the = operator, the left operand is always var[X] and the right operand is always a number.

I tried to look around a bit and found that this could be achieved with Python's pyparsing, but I ran into some difficulties trying to implement it.

Here's what I've tried so far, based roughly on this example (which is one of many examples provided here):
#!/usr/bin/env python
from pyparsing import Word, alphas, nums, infixNotation, opAssoc

class BoolAnd():
    def __init__(self, pattern):
        self.args = pattern[0][0::2]

    def __bool__(self):
        return all(bool(a) for a in self.args)

    __nonzero__ = __bool__

class BoolEqual():
    def __init__(self, pattern):
        self.bit_offset = int(pattern[0][1])
        self.value = int(pattern[0][-1])

    def __bool__(self):
        return True if (0xf >> self.bit_offset) & 0x1 == self.value else False # for now, let's assume var == 0xf

    __nonzero__ = __bool__

variable_name = 'var'
bit_access = variable_name + '[' + Word(nums) + ']'
multibit_access = variable_name + '[' + Word(nums) + ':' + Word(nums) + ']'
value = Word(nums)
operand = bit_access | multibit_access | value

expression = infixNotation(operand,
    [
    ('=', 2, opAssoc.LEFT, BoolEqual),
    ('AND', 2, opAssoc.LEFT, BoolAnd),
    ])

p = expression.parseString('var[3] = 1 AND var[1] = 0', True)
print 'SUCCESS' if bool(p) else 'FAIL'
I have three problems I need help with.

1. For multibit access of the form var[X:Y] = Z, how do I enforce that:
   a. X > Y
   b. Z < 2^{X - Y + 1}
   (For var[X] = Y, I can enforce by the grammar that Y will be either 0 or 1, and this will cause expression.parseString() to fail with an exception if Y != 0/1.)
2. Why does it print SUCCESS? What am I doing wrong? For the input var[3] = 1 AND var[1] = 0 it should print FAIL (you can see in my example that I hardcoded var to be 0xf, so var[3] = 1 is True but var[1] = 0 is False).
3. var is not a class member of BoolEqual nor is it global... is there a way to somehow send it to BoolEqual's __init__ function?

Before getting to the problem solving, I suggest some minor changes to your grammar, primarily the inclusion of results names. Adding these names will make your resulting code much cleaner and more robust. I'm also using some of the expressions that were added in recent pyparsing versions, in the pyparsing_common namespace class:
from pyparsing import pyparsing_common
variable_name = pyparsing_common.identifier.copy()
integer = pyparsing_common.integer.copy()
bit_access = variable_name('name') + '[' + integer('bit') + ']'
multibit_access = variable_name('name') + '[' + integer('start_bit') + ':' + integer('end_bit') + ']'
Part 1a: Enforcing valid values in "var[X:Y]"
This kind of work is best done using parse actions and conditions. Parse actions are parse-time callbacks that you can attach to your pyparsing expressions to modify, enhance, or filter the results, or to raise an exception if a validation rule fails. These are attached using the method:
expr.addParseAction(parse_action_fn)
And parse_action_fn can have any of the following signatures:
def parse_action_fn(parse_string, parse_location, matched_tokens):
def parse_action_fn(parse_location, matched_tokens):
def parse_action_fn(matched_tokens):
def parse_action_fn():
(See more at https://pythonhosted.org/pyparsing/pyparsing.ParserElement-class.html#addParseAction)
Parse actions can return None, return new tokens, modify the given tokens, or raise an exception.
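As a small illustration of two of these behaviours (returning new tokens and raising an exception), here is a hedged sketch. The names `to_int`, `reject_large`, and `bit_index`, and the limit of 31, are made up for this example and are not part of your grammar.

```python
import pyparsing as pp

def to_int(tokens):
    # Return a new token: the matched digit string converted to an int.
    return int(tokens[0])

def reject_large(tokens):
    # Return None (tokens unchanged) when valid, raise when not.
    if tokens[0] > 31:
        raise pp.ParseException("bit index out of range")

# Parse actions run in the order they were added.
bit_index = pp.Word(pp.nums).addParseAction(to_int).addParseAction(reject_large)

print(bit_index.parseString("7")[0])  # 7

try:
    bit_index.parseString("99")
except pp.ParseException as exc:
    print("rejected:", exc)
```

Because `to_int` runs first, `reject_large` already sees an integer rather than a string.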
If all the parse action does is evaluate some condition based on the input tokens, you can write it as a simple function that returns True or False, and pyparsing will raise an exception if False is returned. In your case, your first validation rule could be implemented as:
def validate_multibit(tokens):
    return tokens.end_bit > tokens.start_bit

multibit_access.addCondition(validate_multibit,
                             message="start bit must be less than end bit",
                             fatal=True)
Or even just as a Python lambda function:
multibit_access.addCondition(lambda t: t.end_bit > t.start_bit,
message="start bit must be less than end bit",
fatal=True)
Now you can try this with:
multibit_access.parseString("var[3:0]")
And you will get this exception:
pyparsing.ParseFatalException: start bit must be less than end bit (at char 0), (line:1, col:1)
Part 1b: Enforcing valid values in "var[X:Y] = Z"
Your second validation rule deals not only with the var bit ranges but also with the value they are being compared to. This will require a parse action that is attached to the complete BoolEqual. We could put this in BoolEqual's __init__ method, but I prefer to separate independent functions when possible. And since we will be adding our validation by attaching to the infixNotation level, and infixNotation only accepts parse actions, we will need to write your second validation rule as a parse action that raises an exception. (We will also use a new feature that was only recently released in pyparsing 2.2.0: attaching multiple parse actions at a level in infixNotation.)
Here is the validation that we will wish to perform:

- if single-bit expression var[X], value must be 0 or 1
- if multibit expression var[X:Y], value must be < 2**(Y-X+1)

from pyparsing import ParseFatalException

def validate_equality_args(tokens):
    # infixNotation groups each matched operation, so unwrap the group first
    tokens = tokens[0]
    z = tokens[-1]
    if 'bit' in tokens:
        # single-bit access: var[X] = Z
        if z not in (0,1):
            raise ParseFatalException("invalid equality value - must be 0 or 1")
    else:
        # multibit access: var[X:Y] = Z
        x = tokens.start_bit
        y = tokens.end_bit
        if not z < 2**(y - x + 1):
            raise ParseFatalException("invalid equality value")
And we attach this parse action to infixNotation using:
expression = infixNotation(operand,
    [
    ('=', 2, opAssoc.LEFT, (validate_equality_args, BoolEqual)),
    ('AND', 2, opAssoc.LEFT, BoolAnd),
    ])
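The bounds themselves are easy to check in isolation, which can help when writing tests for the parse action. The following is a plain-Python sketch, independent of pyparsing; the function names are hypothetical, and it assumes the same convention as the condition above (start bit lower than end bit).

```python
def value_limit(start_bit, end_bit):
    """Exclusive upper bound for a value compared against var[start_bit:end_bit]."""
    width = end_bit - start_bit + 1
    return 2 ** width

def is_valid_comparison(value, start_bit, end_bit=None):
    """Apply the single-bit rule when end_bit is None, else the multibit rules."""
    if end_bit is None:
        return value in (0, 1)
    if not end_bit > start_bit:
        return False
    return 0 <= value < value_limit(start_bit, end_bit)

print(is_valid_comparison(1, 3))      # True: single bit may be 0 or 1
print(is_valid_comparison(15, 0, 3))  # True: 4 bits hold values 0..15
print(is_valid_comparison(16, 0, 3))  # False: 16 needs 5 bits
```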
Part 3: Supporting other var names and values than 0xf

To deal with vars of various names, you can add a class-level dict to BoolEqual:

class BoolEqual():
    var_names = {}

and set this ahead of time:
BoolEqual.var_names['var'] = 0xf
And then implement your __bool__ method as just:
return (self.var_names[self.var_name] >> self.bit_offset) & 0x1 == self.value
(This will need to be extended to support multibit, but the general idea is the same.)
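To show the lookup-by-name idea end to end, here is a minimal, self-contained sketch. It is not the grammar above: it swaps infixNotation for a flat "comparison (and comparison)*" grammar, handles single-bit access only, and passes the variable values in as a dict instead of a class-level attribute. The names `comparison`, `evaluate`, and `values` are hypothetical.

```python
import pyparsing as pp

integer = pp.Word(pp.nums).setParseAction(lambda t: int(t[0]))
LBRACK, RBRACK, EQ = (pp.Suppress(c) for c in "[]=")
AND = pp.Suppress(pp.CaselessKeyword("and"))

# One comparison, e.g. "var[3] = 1", grouped so its results names stay together.
comparison = pp.Group(
    pp.Word(pp.alphas)("name") + LBRACK + integer("bit") + RBRACK
    + EQ + integer("value")
)
expression = comparison + pp.ZeroOrMore(AND + comparison)

def evaluate(text, values):
    """Return True if every comparison in `text` holds for the ints in `values`."""
    comparisons = expression.parseString(text, True)  # True: require full match
    return all(
        ((values[c["name"]] >> c["bit"]) & 1) == c["value"]
        for c in comparisons
    )

print(evaluate("var[3] = 1 AND var[1] = 0", {"var": 0xF}))                 # False
print(evaluate("var[0] = 1 and var[2] = 0 and var[3] = 1", {"var": 0x9}))  # True
```

Note that the result is computed from the individual parsed comparisons rather than from the truthiness of the returned ParseResults object itself.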
How about converting the variable to a list of 1's and 0's and using eval to evaluate the boolean expressions (with a small modification, changing = into ==):
def parse(lines, v):
    # Bits of v, least significant first, so that var[i] is bit i
    var = [int(b) for b in reversed(bin(v)[2:])]
    result = []
    for l in lines:
        l = l.replace('=', '==')
        result.append(eval(l))  # eval looks up `var` in this function's locals
    return result

inp = \
"""
var[3] = 0 and var[2] = 1
var[0] = 1 and var[2] = 0 and var[3] = 1
"""
lines = inp.split('\n')[1:-1]
v = 0x09
print(parse(lines, v))
Output:
[False, True]
Note that you should only use eval if you trust the input.
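If the input cannot be trusted, the same check can be done without eval. The following is a minimal sketch using only the standard re module; it assumes single-bit clauses of the form var[X] = Y joined by a lowercase "and", and the names `COMPARISON` and `evaluate_line` are hypothetical.

```python
import re

# One clause such as " var[3] = 0 ", with optional surrounding whitespace.
COMPARISON = re.compile(r"\s*var\[(\d+)\]\s*=\s*(\d+)\s*")

def evaluate_line(line, v):
    """Evaluate 'var[X] = Y and var[Z] = W ...' against the integer v, without eval."""
    results = []
    for clause in re.split(r"\band\b", line):
        match = COMPARISON.fullmatch(clause)
        if match is None:
            raise ValueError("unsupported clause: %r" % clause)
        bit_index, expected = int(match.group(1)), int(match.group(2))
        results.append(((v >> bit_index) & 1) == expected)
    return all(results)

lines = [
    "var[3] = 0 and var[2] = 1",
    "var[0] = 1 and var[2] = 0 and var[3] = 1",
]
print([evaluate_line(line, 0x09) for line in lines])  # [False, True]
```

Anything that does not match the clause pattern raises ValueError instead of being executed.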