Python multiprocessing: avoid communication via module between processes

I want to perform the following task: in the main program 'main.py', I define some input parameters, run a calculation based on these parameters using a function f(), and store the result. The function f() and some of the parameters are defined in a central module 'test.py'.

I have to do this for a large set of parameters, so I want to give each CPU a set of parameters, perform the calculation, and return the result, which is then stored in an array 'data'.

The problem: each process needs to access and define values in the module 'test.py', and I want to avoid any communication or interference between processes.

I attached a minimal working example: the main file main.py and the module test.py.

If one performs the calculation, one sees that the results in 'data' are correct, but the print statement returns pairs (a, b) which do not correspond to the default values.

First, I want to understand what happens here. It seems that each process prints the (a, b) defined by previous processes, then defines the new values and yields the correct result.

Second, for the moment the program works (even for larger datasets and much more complicated calculations), but I don't want to risk wrong results caused by interference between processes. Is there a way to avoid any communication between the processes? Maybe each process could get its own copy of the module and do the calculation using this copy?

I think your issue is that the print statement is printing what the parent main.py process sees as ta and tb, which you have assigned in your calc(x) function. (I don't think you can print in the child worker process, but you definitely aren't, and therefore I don't see how you could see the default values (1, 1).) You print ta and tb before you assign the new values, so it prints the old ones.
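One way to check which process is actually printing the old values is to tag each observation with the process id. This is only a minimal sketch: the original main.py and test.py are not shown here, so the module-level ta and tb, their defaults, and the body of calc(x) below are stand-ins that I am assuming from the description.

```python
import os
from multiprocessing import Pool

# Stand-ins for the values defined in test.py (names and defaults assumed).
ta, tb = 1, 1


def calc(x):
    global ta, tb
    # Record what this process sees *before* the new values are assigned.
    seen = (os.getpid(), ta, tb)
    ta, tb = x, x + 1
    return seen


if __name__ == "__main__":
    with Pool(processes=2) as pool:
        for pid, a, b in pool.map(calc, range(6)):
            print(f"pid={pid} saw (ta, tb)=({a}, {b})")
```

If the same pid shows up again with non-default values, that process has handled an earlier task and still holds the values it assigned then. By default a Pool keeps its worker processes alive across tasks, so that would be expected.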

If you really want to make sure all your processes are 100% independent of the main process, you could pass all your arguments in one structure, i.e. define in your test.py:

def f(struct):
    return (struct.a+struct.b)**struct.s

and in your main.py make a list of these structures. So I guess you define a structure and populate it:

class myStruct:
    def __init__(self, s, a=1, b=1):  # here the default a and b values are set to 1
        self.a = a
        self.b = b
        self.s = s

You could then populate a list of these structures and pass the list to your multiprocessing.Pool.
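Putting those pieces together might look roughly like the sketch below. The run() helper, the number of processes, and the parameter values are made up for illustration; only f and myStruct come from the snippets above.

```python
from multiprocessing import Pool


class myStruct:
    """Bundle of inputs for one calculation (same shape as the class above)."""

    def __init__(self, s, a=1, b=1):
        self.a = a
        self.b = b
        self.s = s


def f(struct):
    # Reads only its argument, so no module-level values are involved.
    return (struct.a + struct.b) ** struct.s


def run(param_sets, processes=2):
    # Hypothetical helper: map f over the parameter sets in a worker pool.
    with Pool(processes=processes) as pool:
        return pool.map(f, param_sets)


if __name__ == "__main__":
    # Illustrative values only; the real parameter sets are not shown in the question.
    params = [myStruct(s=s, a=a, b=a + 1) for s in (1, 2) for a in (1, 2)]
    data = run(params)
    print(data)
```

Since each task carries its own a, b and s, the result no longer depends on whatever an earlier task left behind in the module. pool.map returns the results in the same order as the input list, so data lines up with params.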

Not sure if that was super helpful, but I don't have enough reputation to make a comment.

Cheers mate and good luck.
