I have a list that contains symbols and operators, like this:
[['B31', '+', 'W311', '*', ['B21', '+', 'W211', '*', ['B11', '+', 'W111', '*', 'x'], '+', 'W221', '*',
['B12', '+', 'W121', '*', 'x']], '+', 'W312', '*',
['B22', '+', 'W212', '*', ['B11', '+', 'W111', '*', 'x'], '+', 'W222', '*',
['B12', '+', 'W121', '*', 'x']]]]
I wish to group operators together with their operands in lists of 3 elements, here that would be
[['B31', '+',
['W311', '*',
['B21', '+',
[['W211', '*', [['B11', '+', ['W111', '*', 'x']]],
'+', ['W221', '*',
['B12', '+', ['W121', '*', 'x']]]]]]]],
'+', ['W312', '*',
['B22', '+',
[['W212', '*', ['B11', '+', ['W111', '*', 'x']]],
'+', ['W222', '*',
['B12', '+', ['W121', '*', 'x']]]]]]]]
My algorithm looks like this:
from typing import List, Union

def group_by_symbol(formula: Union[List, str], symbol: str) -> List:
    """
    Group multiplication in formula: a op b -> [a op b]
    :param formula: contains operations not inside a list.
    :return: operations enclosed in a list.
    """
    modified_formula = formula
    # loop backwards
    for i in range(len(modified_formula) - 1, -1, -1):
        if i > len(modified_formula) - 1:
            continue
        if modified_formula[i] == symbol:
            # introduce parentheses around symbol
            group = [modified_formula[i - 1], modified_formula[i], modified_formula[i + 1]]
            del modified_formula[i:i + 2]
            modified_formula[i - 1] = group
        elif isinstance(modified_formula[i], List) \
                and len(modified_formula[i]) > 3:
            # recurse
            modified_formula[i] = group_by_symbol(modified_formula[i], symbol)
    return modified_formula
It is called like below:
grouped = group_by_symbol(formula, '*')
grouped = group_by_symbol(grouped, '+')
However, when a list contains more than one addition, the desired groups are not created. In the result I obtain, shown here, some lists still contain more than one + symbol and not every list has 3 elements:
[[['B31', '+', [['W311', '*', ['B21', '+', ['W211', '*', ['B11', '+', ['W111', '*', 'x']]], '+',
['W221', '*', ['B12', '+', ['W121', '*', 'x']]]]],
'+', ['W312', '*',
['B22', '+', ['W212', '*', ['B11', '+', ['W111', '*', 'x']]], '+',
['W222', '*', ['B12', '+', ['W121', '*', 'x']]]]]]]]]
I suspect the error has something to do with an early exit from the recursion. However, adding a check to the condition that the sublist contains only strings results in endless recursion.
We can dramatically simplify the program by writing a pure function. The numbered steps here correspond to the numbered comments in the program below.
1. If there is no op, we have reached the base case. If the supplied argument, arg, is a list, convert it to an expression; otherwise simply return the arg.
2. Otherwise there is an op. If the supplied arg is a list, we need to recursively convert it, too. Return a 3-part expression with expr(*arg), the op, and the recursive result, expr(*more).
3. Otherwise there is an op and arg is not a list. Return a 3-part expression with arg, the op, and the recursive result, expr(*more).
tree = \
[['B31','+','W311','*',['B21','+','W211','*',['B11','+','W111','*','x'],'+','W221','*',['B12','+','W121','*','x']],'+','W312','*',['B22','+','W212','*',['B11','+','W111','*','x'],'+','W222','*',['B12','+','W121','*','x']]]]
def expr(arg, op = None, *more):
    if not op:
        return expr(*arg) if isinstance(arg, list) else arg #1
    elif isinstance(arg, list):
        return [ expr(*arg), op, expr(*more) ] #2
    else:
        return [ arg, op, expr(*more) ] #3
print(expr(tree))
# ['B31', '+', ['W311', '*', [['B21', '+', ['W211', '*', [['B11', '+', ['W111', '*', 'x']], '+', ['W221', '*', ['B12', '+', ['W121', '*', 'x']]]]]], '+', ['W312', '*', ['B22', '+', ['W212', '*', [['B11', '+', ['W111', '*', 'x']], '+', ['W222', '*', ['B12', '+', ['W121', '*', 'x']]]]]]]]]]
Maybe we can verify the output a little better if we convert the expression to a string -
def expr_to_str(expr1, op, expr2):
    return \
        f"({expr_to_str(*expr1) if isinstance(expr1, list) else expr1} {op} {expr_to_str(*expr2) if isinstance(expr2, list) else expr2})"
print(expr_to_str(*expr(tree)))
# (B31 + (W311 * ((B21 + (W211 * ((B11 + (W111 * x)) + (W221 * (B12 + (W121 * x)))))) + (W312 * (B22 + (W212 * ((B11 + (W111 * x)) + (W222 * (B12 + (W121 * x))))))))))
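Another way to check the result is to test the shape directly. The sketch below assumes a well-formed expression is either a non-list leaf or a 3-element list [left, op, right]; is_grouped is a hypothetical helper name that does not appear in the code above, and the small input list is made up for illustration:

```python
def expr(arg, op=None, *more):
    # same function as above, repeated so this sketch runs on its own
    if not op:
        return expr(*arg) if isinstance(arg, list) else arg
    elif isinstance(arg, list):
        return [expr(*arg), op, expr(*more)]
    else:
        return [arg, op, expr(*more)]

def is_grouped(node):
    # hypothetical checker: a leaf is fine, a list must be [left, op, right]
    if not isinstance(node, list):
        return True
    return len(node) == 3 and is_grouped(node[0]) and is_grouped(node[2])

small = ['a', '+', 'b', '*', ['c', '+', 'd', '*', 'e']]
result = expr(small)
print(result)
# ['a', '+', ['b', '*', ['c', '+', ['d', '*', 'e']]]]
print(is_grouped(result))
# True
print(is_grouped(small))
# False
```

The ungrouped input fails the check because its outer list has five elements.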
Here's another way using a class -
class expr:
    def __init__(self, x, op = None, *y):
        self.op = op
        self.x = expr(*x) if isinstance(x, list) else x
        self.y = expr(*y) if y else y
    def __str__(self):
        if not self.op:
            return f"{self.x}"
        else:
            return f"({self.x} {self.op} {self.y})"
print(expr(tree))
# (B31 + (W311 * ((B21 + (W211 * ((B11 + (W111 * x)) + (W221 * (B12 + (W121 * x)))))) + (W312 * (B22 + (W212 * ((B11 + (W111 * x)) + (W222 * (B12 + (W121 * x))))))))))
variadic support
In a comment you ask if expr can support 3-element results and 2-element results. Here is one such flexible implementation -
In the constructor, __init__, we do a simple case analysis -
1. If a is a list with fewer than 4 elements, we don't need to break anything down. Simply map expr over each element of a.
2. Otherwise a is a list of at least 4 elements, so we need to break it into smaller expressions. Construct an expression of the first element, expr(a[0]), the second element, expr(a[1]), and the recursive result of all remaining elements, expr(a[2::]).
3. Otherwise a is not a list, i.e. it is a single item. Set the expression's data to the singleton, [ a ].
In the __str__ method, we do a similar analysis to convert our expression's data into a string -
1. If self.data is empty, return the empty string, "".
2. Otherwise self.data is not empty. If it has fewer than 2 elements (a singleton), return the singleton result, f"{self.data[0]}".
3. Otherwise self.data has 2 or more elements. Return a (...)-enclosed string where each part is recursively converted to a str and joined with a space, " ".
class expr:
    def __init__(self, a):
        if isinstance(a, list):
            if len(a) < 4:
                self.data = [ expr(x) for x in a ] #1
            else:
                self.data = [ expr(a[0]), expr(a[1]), expr(a[2::]) ] #2
        else:
            self.data = [ a ] #3
    def __str__(self):
        if not self.data:
            return "" #1 empty
        elif len(self.data) < 2:
            return f"{self.data[0]}" #2 singleton
        else:
            return "(" + " ".join(str(x) for x in self.data) + ")" #3 variadic
print(expr(tree))
# (B31 + (W311 * ((B21 + (W211 * ((B11 + (W111 * x)) + (W221 * (B12 + (W121 * x)))))) + (W312 * (B22 + (W212 * ((B11 + (W111 * x)) + (W222 * (B12 + (W121 * x))))))))))
print(expr([[ "¬", ["a", "+", "b"]], "and", [["length", "x"], ">", 0]]))
# ((¬ (a + b)) and ((length x) > 0))
breaking it down
By decomposing a complex problem into smaller parts, it is easier to solve the sub-problems, and it affords us more flexibility and control. For what it's worth, this technique does not rely on Python's specific OOP mechanisms. These are ordinary, well-defined, pure functions -
def unit(): return ('unit',)
def nullary(op): return ('nullary', op)
def unary(op, a): return ('unary', op, a)
def binary(op, a, b): return ('binary', op, a, b)
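As a quick illustration of these constructors, an expression can be assembled by hand before any parsing is involved. This is a minimal sketch; the variable name e and the choice of 1 + 2 as the example are made up for illustration:

```python
def unit(): return ('unit',)
def nullary(op): return ('nullary', op)
def unary(op, a): return ('unary', op, a)
def binary(op, a, b): return ('binary', op, a, b)

# hand-built expression for 1 + 2, with the operator in prefix position
e = binary(nullary('+'), nullary(1), nullary(2))
print(e)
# ('binary', ('nullary', '+'), ('nullary', 1), ('nullary', 2))
```

Each constructor only tags its arguments, so the result is plain tuple data that can be compared with == or inspected by index.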
Now using a flat case analysis as we've done before, we implement our recursive expression constructor expr -
1. If the input a is not a list, it is a single value. Construct a nullary expression with a.
2. Otherwise a is a list. If it is empty, construct the unit expression.
3. If a has one element, construct a nullary expression with the only element, expr(a[0]).
4. If a has two elements, construct a unary expression with expr(a[0]) and expr(a[1]).
5. Otherwise a has three or more elements. If it is in infix position (is_infix), convert to prefix position. Construct a binary expression with expr(a[0]) and expr(a[1]) in swapped position, and the recursive result expr(a[2::]).
6. Otherwise construct a binary expression of expr(a[0]) and expr(a[1]) and the recursive result expr(a[2::]).
infix_ops = set([ '+', '-', '*', '/', '>', '<', 'and', 'or' ])
def is_infix (a):
    return a[1] in infix_ops
def expr(a):
    if not isinstance(a, list):
        return nullary(a) #1
    elif len(a) == 0:
        return unit() #2
    elif len(a) == 1:
        return nullary(expr(a[0])) #3
    elif len(a) == 2:
        return unary(expr(a[0]), expr(a[1])) #4
    elif is_infix(a):
        return binary(expr(a[1]), expr(a[0]), expr(a[2::])) #5
    else:
        return binary(expr(a[0]), expr(a[1]), expr(a[2::])) #6
Now to see the result -
tree2 = \
[[ "¬", ["a", "+", "b"]], "and", [["length", "x"], ">", 0]]
print(expr(tree2))
# ('binary', ('nullary', 'and'), ('unary', ('nullary', '¬'), ('binary', ('nullary', '+'), ('nullary', 'a'), ('nullary', ('nullary', 'b')))), ('nullary', ('binary', ('nullary', '>'), ('unary', ('nullary', 'length'), ('nullary', 'x')), ('nullary', ('nullary', 0)))))
This is just one possible representation of our expressions. Because we implemented our expressions using tuple, Python is able to print them out, albeit verbosely. By contrast, here's how Python chooses to represent objects -
class foo: pass
f = foo()
print(f)
# <__main__.foo object at 0x7f2ba03bc8e0>
What's important here is that our expression data structure is well-defined and we can easily perform computations on it or represent it other ways -
def expr_to_str(m):
    if not isinstance(m, tuple):
        return str(m)
    elif m[0] == "unit":
        return ""
    elif m[0] == "nullary":
        return expr_to_str(m[1])
    elif m[0] == "unary":
        return f"({expr_to_str(m[1])} {expr_to_str(m[2])})"
    elif m[0] == "binary":
        return f"({expr_to_str(m[1])} {expr_to_str(m[2])} {expr_to_str(m[3])})"
    else:
        raise TypeError("invalid expression type", m[0])
print(expr_to_str(expr(tree2)))
# (and (¬ (+ a b)) (> (length x) 0))
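To illustrate representing the same data another way, here is a sketch of a printer that puts binary operators back between their operands. The name expr_to_infix is hypothetical, and the sketch assumes every binary expression came from infix input; a binary built by case #6 would be printed with its first operand in the middle:

```python
def unit(): return ('unit',)
def nullary(op): return ('nullary', op)
def unary(op, a): return ('unary', op, a)
def binary(op, a, b): return ('binary', op, a, b)

infix_ops = {'+', '-', '*', '/', '>', '<', 'and', 'or'}

def is_infix(a):
    return a[1] in infix_ops

def expr(a):
    # same constructor as above, repeated so this sketch runs on its own
    if not isinstance(a, list):
        return nullary(a)
    elif len(a) == 0:
        return unit()
    elif len(a) == 1:
        return nullary(expr(a[0]))
    elif len(a) == 2:
        return unary(expr(a[0]), expr(a[1]))
    elif is_infix(a):
        return binary(expr(a[1]), expr(a[0]), expr(a[2::]))
    else:
        return binary(expr(a[0]), expr(a[1]), expr(a[2::]))

def expr_to_infix(m):
    # hypothetical printer: like expr_to_str, but the operator of a
    # binary expression is written between its two operands
    if not isinstance(m, tuple):
        return str(m)
    elif m[0] == "unit":
        return ""
    elif m[0] == "nullary":
        return expr_to_infix(m[1])
    elif m[0] == "unary":
        return f"({expr_to_infix(m[1])} {expr_to_infix(m[2])})"
    elif m[0] == "binary":
        return f"({expr_to_infix(m[2])} {expr_to_infix(m[1])} {expr_to_infix(m[3])})"
    else:
        raise TypeError("invalid expression type", m[0])

print(expr_to_infix(expr([3, "+", 2, "*", 5, "-", 1])))
# (3 + (2 * (5 - 1)))
```

Only the binary branch differs from expr_to_str, which is the benefit of keeping the data structure separate from how it is displayed.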
evaluating an expression
So what if we wanted to evaluate one of our expressions?
m = expr([3, "+", 2, "*", 5, "-", 1])
print(expr_to_str(m))
# (+ 3 (* 2 (- 5 1)))
print(eval_expr(m))
# 11
You're just a few steps away from being able to write eval_expr -
def eval_expr(m):
    if not isinstance(m, tuple):
        return m
    elif m[0] == "unit":
        return None
    elif m[0] == "nullary":
        return eval0(m[1])
    elif m[0] == "unary":
        return eval1(m[1], m[2])
    elif m[0] == "binary":
        return eval2(m[1], m[2], m[3])
    else:
        raise TypeError("invalid expression type", m[0])
See, complex problems are easier when breaking them down into small parts. Now we just write eval0, eval1, and eval2 -
def eval0(op):
    return eval_expr(op)
def eval1(op, a):
    if op == expr("not"): # or op == expr("¬") ...
        return not eval_expr(a)
    elif op == expr("neg"): # or op == expr("~") ...
        return -eval_expr(a)
    # +, ++, --, etc...
    else:
        raise ValueError("invalid op", op)
def eval2(op, a, b):
    if op == expr("+"):
        return eval_expr(a) + eval_expr(b)
    elif op == expr("-"):
        return eval_expr(a) - eval_expr(b)
    elif op == expr("*"):
        return eval_expr(a) * eval_expr(b)
    elif op == expr("/"):
        return eval_expr(a) / eval_expr(b)
    elif op == expr("and"):
        return eval_expr(a) and eval_expr(b)
    # >, <, or, xor, etc...
    else:
        raise ValueError("invalid op", op)
Let's see a mixture of expressions now -
print(eval_expr(expr([True, 'and', ['not', False]])))
# True
print(eval_expr(expr(['neg', [9, '*', 11]])))
# -99
print(eval_expr(expr(['stay', '+', 'inside'])))
# stayinside
You can even define your own functions -
def eval1(op, a):
    if op == expr("not"):
        return not eval_expr(a)
    elif op == expr("neg"):
        return -eval_expr(a)
    elif op == expr('scream'):
        return eval_expr(a).upper() # make uppercase!
    else:
        raise ValueError("invalid op", op)
And use them in your expressions -
print(eval_expr(expr(["scream", ["stay", "+", "inside"]])))
# STAYINSIDE
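As the number of operators grows, the if/elif chains in eval1 and eval2 could be swapped for lookup tables. The following is only a sketch of that alternative, not the evaluator above: UNARY_OPS, BINARY_OPS and apply_op are hypothetical names, the tables use functions from Python's standard operator module, and and/or are left out because a plain function call evaluates both operands and so cannot short-circuit:

```python
import operator

def nullary(op): return ('nullary', op)
def unary(op, a): return ('unary', op, a)
def binary(op, a, b): return ('binary', op, a, b)

# hypothetical lookup tables mapping operator names to functions
UNARY_OPS = {'not': operator.not_, 'neg': operator.neg}
BINARY_OPS = {
    '+': operator.add, '-': operator.sub,
    '*': operator.mul, '/': operator.truediv,
    '>': operator.gt, '<': operator.lt,
}

def eval_expr(m):
    if not isinstance(m, tuple):
        return m
    elif m[0] == "unit":
        return None
    elif m[0] == "nullary":
        return eval_expr(m[1])
    elif m[0] == "unary":
        return apply_op(UNARY_OPS, m[1], m[2])
    elif m[0] == "binary":
        return apply_op(BINARY_OPS, m[1], m[2], m[3])
    else:
        raise TypeError("invalid expression type", m[0])

def apply_op(table, op, *args):
    name = eval_expr(op)  # ('nullary', '+') evaluates to '+'
    if name not in table:
        raise ValueError("invalid op", op)
    # evaluate every operand, then call the function found in the table
    return table[name](*(eval_expr(a) for a in args))

# 3 + (2 * (5 - 1)), built by hand in prefix form
m = binary(nullary('+'), nullary(3),
           binary(nullary('*'), nullary(2),
                  binary(nullary('-'), nullary(5), nullary(1))))
print(eval_expr(m))
# 11
print(eval_expr(unary(nullary('neg'), binary(nullary('*'), nullary(9), nullary(11)))))
# -99
```

With this layout, adding an operator such as scream would mean adding one table entry, for example UNARY_OPS['scream'] = str.upper, instead of another elif branch.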