Function interpreter in Python
Suppose I have a function in Python that includes mathematical expressions from base Python and some from NumPy and SciPy, possibly including some distributions. As a running example, consider:
import numpy as np
from scipy.stats import *

def my_process(args):
    """ My process
    """
    x1 = norm.rvs(loc=0, scale=1)
    x2 = x1 + norm.rvs(loc=2, scale=0.5)
    x3 = (x1 * np.exp(x2)) + norm.rvs(loc=-1, scale=2)
    return x1, x2, x3
I want to write an interpreter for this function that turns each variable appearing in it into a class, which generically looks as follows:
class genericProcess():
    def __init__(self):
        pass

    def process(self, parents):
        """ This needs to be implemented for each class
        """
        raise NotImplementedError
So for our example function, we would interpret the given function as the following three classes:
class x1Process(genericProcess):
    def __init__(self):
        pass

    def process(self):
        return norm.rvs(loc=0, scale=1)

class x2Process(genericProcess):
    def __init__(self):
        pass

    def process(self, parents):
        return parents["x1"] + norm.rvs(loc=2, scale=0.5)

class x3Process(genericProcess):
    def __init__(self):
        pass

    def process(self, parents):
        return (parents["x1"] * np.exp(parents["x2"])) + norm.rvs(loc=-1, scale=2)
Is this even possible at all? If yes, what would be the first steps to start implementing it? If not, what would make the problem well-posed so that I can start implementing it? For example, I thought having a string instead of a function might make the problem simpler, although I am not sure.
EDIT:
Thanks to the comments I can make the question a bit more concrete. I want a function called "my_interpreter" that takes a user-specified function as input and outputs a dictionary where each key is a line of the function (or alternatively, each key is one of the return elements of the function) and each value is a class that implements the "process" method of the "genericProcess" class. In our running example:
interpreted_function_dictionary = my_interpreter(my_process)
with
interpreted_function_dictionary = {
    "x1": x1Process,
    "x2": x2Process,
    "x3": x3Process
}
It's difficult to intercept variable definitions at runtime. You would need to parse the code with ast, as suggested in the comments.
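As a rough illustration of the ast route, here is a minimal sketch, not a complete interpreter. It assumes the function body consists only of simple `name = expression` assignments, and it works on the source as a string (for a function defined in a file, `inspect.getsource` could supply that string). The names `GenericProcess`, `_make_class`, `parent_names` and the `namespace` argument are made up for this example.

```python
import ast

# Source of the running example, kept as a string so that it can be parsed
# without importing numpy or scipy.
SOURCE = '''
def my_process(args):
    x1 = norm.rvs(loc=0, scale=1)
    x2 = x1 + norm.rvs(loc=2, scale=0.5)
    x3 = (x1 * np.exp(x2)) + norm.rvs(loc=-1, scale=2)
    return x1, x2, x3
'''


class GenericProcess:
    def process(self, parents=None):
        raise NotImplementedError


def _make_class(target, code, parent_names, namespace):
    """Build a GenericProcess subclass that evaluates one right-hand side."""
    def process(self, parents=None):
        parents = parents or {}
        # Only the declared parents are visible as local names.
        local_vars = {name: parents[name] for name in parent_names}
        return eval(code, namespace, local_vars)

    attrs = {"process": process, "parent_names": tuple(parent_names)}
    return type(f"{target}Process", (GenericProcess,), attrs)


def my_interpreter(source, namespace):
    """Map each assigned variable name to a generated process class.

    `namespace` supplies the globals the expressions need (e.g. np, norm).
    """
    func_def = ast.parse(source).body[0]
    assigned, classes = [], {}
    for stmt in func_def.body:
        simple = (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
                  and isinstance(stmt.targets[0], ast.Name))
        if not simple:
            continue  # docstrings, return statements, etc. are skipped
        target = stmt.targets[0].id
        # Parents are earlier variables that the right-hand side reads.
        used = {n.id for n in ast.walk(stmt.value) if isinstance(n, ast.Name)}
        parent_names = [name for name in assigned if name in used]
        code = compile(ast.Expression(body=stmt.value), "<interpreted>", "eval")
        classes[target] = _make_class(target, code, parent_names, namespace)
        if target not in assigned:
            assigned.append(target)
    return classes
```

With numpy and scipy.stats.norm imported, `my_interpreter(SOURCE, {"np": np, "norm": norm})` should return a dictionary with the keys "x1", "x2" and "x3", and `classes["x2"]().process({"x1": 0.3})` would evaluate the second line with the given parent value. Control flow, tuple unpacking and reassignment are not handled by this sketch.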
sympy
An alternative is to replace all the math operations with their symbolic representations, which can be executed at a later time. The sympy package does exactly that and should contain most of the math operations you need. There is also sympy.stats, which has most of the stats functions. (This is very similar to symbolic computation in MATLAB with syms.)
To use sympy with a numpy backend, you can use its lambdify function, e.g.:
from sympy import sin, lambdify
from sympy.abc import x
expr = sin(x)/x
f = lambdify(x, expr, "numpy")
As of version 1.11, it doesn't seem to support scipy yet.
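To connect this to the running example, one possible sketch keeps the deterministic structure symbolic and feeds the random draws in as ordinary inputs. The symbols `e2` and `e3` are hypothetical placeholders for the noise terms, and the sampling is done with numpy's random generator rather than scipy, so this is an adaptation rather than a direct translation of the original function.

```python
import numpy as np
from sympy import exp, lambdify, symbols

# x1 is the root variable; e2 and e3 are hypothetical symbols standing in for
# the noise terms that norm.rvs draws in the original function.
x1, e2, e3 = symbols("x1 e2 e3")
x2_expr = x1 + e2
x3_expr = x1 * exp(x2_expr) + e3

# One numpy-backed function that returns (x2, x3) for given inputs.
f = lambdify((x1, e2, e3), (x2_expr, x3_expr), "numpy")

rng = np.random.default_rng(0)
n = 5
x1_s = rng.normal(loc=0, scale=1, size=n)
e2_s = rng.normal(loc=2, scale=0.5, size=n)
e3_s = rng.normal(loc=-1, scale=2, size=n)
x2_s, x3_s = f(x1_s, e2_s, e3_s)

# The symbols an expression depends on are available for inspection.
print(x3_expr.free_symbols)
```

Because each expression exposes `free_symbols`, the dependency structure asked for in the question can be read off the symbolic form directly.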
Similar to sympy, you can create wrapper classes for all the math operations that return an expression instead of the result. Then, each expression would be your process, and you can evaluate each expression to get the resulting value.
Not sure if this fits OP's requirement.
from dataclasses import dataclass, field
from typing import Any, ClassVar

import numpy as np
import scipy
import scipy.stats  # import the submodule explicitly so scipy.stats is available

@dataclass
class EvaluatableExpression:
    name: str
    args: Any = field(default_factory=tuple)
    kwargs: Any = field(default_factory=dict)
    package: ClassVar = None

    def evaluate(self):
        # recursively evaluate any executable args and kwargs
        args = (arg.evaluate() if isinstance(arg, EvaluatableExpression) else arg for arg in self.args)
        kwargs = {k: v.evaluate() if isinstance(v, EvaluatableExpression) else v for k, v in self.kwargs.items()}
        return getattr(self.package, self.name)(*args, **kwargs)

@dataclass
class NumpyFunc(EvaluatableExpression):
    package: ClassVar = np

@dataclass
class ScipyFunc(EvaluatableExpression):
    package: ClassVar = scipy

@dataclass
class ScipyStats(EvaluatableExpression):
    stats_package: str = ''

    def __post_init__(self):
        # look up the distribution object, e.g. scipy.stats.norm
        self.package = getattr(scipy.stats, self.stats_package)
Plain Python math operators can be handled using magic methods:
@dataclass
class PythonMath(EvaluatableExpression):
    def evaluate(self):
        # the function names are names of magic methods, e.g. '__add__',
        # assuming only binary ops on args[0] and args[1]
        op0 = self.args[0]
        self.package = op0.evaluate() if isinstance(op0, EvaluatableExpression) else op0
        # save args and load args later so it doesn't change args before and after evaluation
        temp_args = self.args
        self.args = self.args[1:]
        result = super().evaluate()
        self.args = temp_args
        return result

@dataclass
class Operand:
    content: Any

    def __add__(self, other):
        return PythonMath(name='__add__', args=(self.content, other))

    def __sub__(self, other):
        return PythonMath(name='__sub__', args=(self.content, other))

    def __mul__(self, other):
        return PythonMath(name='__mul__', args=(self.content, other))

    def __truediv__(self, other):
        return PythonMath(name='__truediv__', args=(self.content, other))

    ...
For Operand, it's not possible to catch magic methods with __getattr__ or __getattribute__, because Python looks up operator methods on the class rather than through instance attribute access. You can write a custom metaclass to generate them instead of copying and pasting code.
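A lighter-weight option than a metaclass is to attach the operator methods in a loop after the class is defined. The following is a self-contained sketch of that idea, not the code above: `BinaryOp` is a hypothetical stand-in for PythonMath, `_make_op` is a helper invented here, and evaluation goes through the standard `operator` module.

```python
import operator
from dataclasses import dataclass
from typing import Any


@dataclass
class BinaryOp:
    """Hypothetical stand-in for PythonMath: a deferred binary operation."""
    name: str
    left: Any
    right: Any

    def evaluate(self):
        left = self.left.evaluate() if hasattr(self.left, "evaluate") else self.left
        right = self.right.evaluate() if hasattr(self.right, "evaluate") else self.right
        # operator.__add__, operator.__mul__, ... mirror the magic method names
        return getattr(operator, self.name)(left, right)


@dataclass
class Operand:
    content: Any


def _make_op(name):
    """Create a magic method that records the operation instead of running it."""
    def method(self, other):
        return BinaryOp(name, self.content, other)
    method.__name__ = name
    return method


# Attach the methods to the class itself, which is where Python looks them up.
for _name in ("__add__", "__sub__", "__mul__", "__truediv__", "__pow__"):
    setattr(Operand, _name, _make_op(_name))
del _name

expr = Operand(3) * BinaryOp("__add__", 1, 2)
print(expr)
print(expr.evaluate())
```

Adding another operator then only requires one more name in the tuple.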
def process(args):
    """ My process
    """
    x1 = ScipyStats(stats_package='norm', name='rvs', kwargs={'loc': 0, 'scale': 1})
    x2 = Operand(x1) + ScipyStats(stats_package='norm', name='rvs', kwargs={'loc': 2, 'scale': 0.5})
    x3 = Operand(Operand(x1) * NumpyFunc(name='exp', args=(x2,))) + ScipyStats(stats_package='norm', name='rvs',
                                                                                   kwargs={'loc': -1, 'scale': 2})
    return x1, x2, x3
Now, all the returned variables will be "expressions". We can see:
>>> x = process(None)
>>> print(x[0])
ScipyStats(name='rvs', args=(), kwargs={'loc': 0, 'scale': 1}, stats_package='norm')
>>> print(x[1])
PythonMath(name='__add__', args=(ScipyStats(name='rvs', args=(), kwargs={'loc': 0, 'scale': 1}, stats_package='norm'), ScipyStats(name='rvs', args=(), kwargs={'loc': 2, 'scale': 0.5}, stats_package='norm')), kwargs={})
>>> print(x[2])
PythonMath(name='__add__', args=(PythonMath(name='__mul__', args=(ScipyStats(name='rvs', args=(), kwargs={'loc': 0, 'scale': 1}, stats_package='norm'), NumpyFunc(name='exp', args=(PythonMath(name='__add__', args=(ScipyStats(name='rvs', args=(), kwargs={'loc': 0, 'scale': 1}, stats_package='norm'), ScipyStats(name='rvs', args=(), kwargs={'loc': 2, 'scale': 0.5}, stats_package='norm')), kwargs={}),), kwargs={})),
And evaluating them gives:
>>> print(x[0].evaluate())
-1.331802485169775
>>> print(x[1].evaluate())
0.7789471967940289
>>> print(x[2].evaluate())
-60.03245897617831
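Note that every call to evaluate() re-runs the whole expression tree, so the three numbers above come from independent draws of x1 rather than from one joint sample. If the variables need to share values, one option is to pass a cache through the evaluation. The sketch below shows the idea with a hypothetical `Node` class and the standard library's `random.gauss` in place of `norm.rvs`; it is not a drop-in change to the classes above.

```python
import math
import random


class Node:
    """Hypothetical expression node whose value can be shared via a cache."""

    def __init__(self, func, *parents):
        self.func = func
        self.parents = parents

    def evaluate(self, cache=None):
        # A fresh cache means a fresh sample; a shared cache reuses values.
        cache = {} if cache is None else cache
        key = id(self)
        if key not in cache:
            values = [parent.evaluate(cache) for parent in self.parents]
            cache[key] = self.func(*values)
        return cache[key]


x1 = Node(lambda: random.gauss(0, 1))
x2 = Node(lambda a: a + random.gauss(2, 0.5), x1)
x3 = Node(lambda a, b: a * math.exp(b) + random.gauss(-1, 2), x1, x2)

# One shared cache gives one consistent joint sample of (x1, x2, x3).
cache = {}
sample = tuple(node.evaluate(cache) for node in (x1, x2, x3))
print(sample)
```

Evaluating with a new, empty cache draws a new joint sample.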
Of course, you can make defining math expressions prettier and more concise by defining aliases, e.g. borrowing from the pyspark library:
def _create_function(name, doc=""):
    """ Create an expression-building function by name"""
    def _(*args, **kwargs):
        package, new_name = name.split('__')
        if package == 'np':
            return NumpyFunc(name=new_name, args=args, kwargs=kwargs)
        elif package == 'scipy':
            return ScipyFunc(name=new_name, args=args, kwargs=kwargs)
        elif package == 'ss':
            # assumes names of the form '<distribution>_<method>', e.g. 'norm_rvs'
            dist, method = new_name.rsplit('_', 1)
            return ScipyStats(name=method, args=args, kwargs=kwargs, stats_package=dist)
        raise ValueError(f'unknown package prefix: {package}')
    _.__name__ = name
    _.__doc__ = doc
    return _

ALL = [f'np__{func}' for func in np.ma.__all__] + [f'scipy__{func}' for func in ...] +
...
# ALL is a flat list of prefixed names, so register one function per name
for _name in ALL:
    globals()[_name] = _create_function(_name)
del _name
Then you can have something like:
x1 = ss__norm_rvs(loc=0, scale=1)
x2 = Operand(x1) + ss__norm_rvs(loc=2, scale=0.5)
x3 = Operand(Operand(x1) * np__exp(x2)) + ss__norm_rvs(loc=-1, scale=2)
You could even get rid of the pesky Operand by making everything a subclass of Operand.
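A rough sketch of what that could look like follows. It uses a single hypothetical base class `Expr` that carries the operator overloads and a hypothetical `Call` node for deferred function calls, with the standard library's `random.gauss` and `math.exp` standing in for `norm.rvs` and `np.exp`. The reflected methods (`__radd__` and so on) let plain numbers appear on the left-hand side.

```python
import math
import operator
import random


def _value(x):
    """Evaluate expressions and pass plain values through unchanged."""
    return x.evaluate() if isinstance(x, Expr) else x


class Expr:
    """Hypothetical base class: every expression supports arithmetic."""

    def evaluate(self):
        raise NotImplementedError

    def __add__(self, other):
        return Call(operator.add, self, other)

    def __radd__(self, other):
        return Call(operator.add, other, self)

    def __sub__(self, other):
        return Call(operator.sub, self, other)

    def __rsub__(self, other):
        return Call(operator.sub, other, self)

    def __mul__(self, other):
        return Call(operator.mul, self, other)

    def __rmul__(self, other):
        return Call(operator.mul, other, self)

    def __truediv__(self, other):
        return Call(operator.truediv, self, other)

    def __rtruediv__(self, other):
        return Call(operator.truediv, other, self)


class Call(Expr):
    """A deferred call func(*args, **kwargs); arguments may be expressions."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def evaluate(self):
        args = [_value(a) for a in self.args]
        kwargs = {k: _value(v) for k, v in self.kwargs.items()}
        return self.func(*args, **kwargs)


# The running example without any Operand wrappers.
x1 = Call(random.gauss, 0, 1)
x2 = x1 + Call(random.gauss, 2, 0.5)
x3 = x1 * Call(math.exp, x2) + Call(random.gauss, -1, 2)
print(x3.evaluate())
```

Since operators on any `Expr` return another `Expr`, expressions compose without wrapping each operand by hand.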
Hope this helps.