
Python multiprocessing not deterministic (Manager)?

I am trying to share a dict via the Manager interface and it seems the results vary! Sometimes it is {1: 8, 2: 3, 3: 2, 4: 1} and other times {1: 6, 2: 3, 3: 2, 4: 1}, {1: 7, 2: 3, 3: 2, 4: 1}, etc. This is just counting the divisors and should work deterministically...

The code is here:

from multiprocessing import Process,  Manager
def div(x,d):
    for i in range(1,x):
        if x%i == 0:
            try:
                d[i] +=1
            except:
                d[i]=1

mgr = Manager()
d = mgr.dict()
w = [Process(target=div,args=(i,d)) for i in range(1,10)]

for k in w:
    k.start()
for k in w:
    k.join()

print d

There is a race condition in your code, right here:

            try:
                d[i] += 1
            except:
                d[i] = 1
Consider what happens if d[i] does not yet exist and two processes reach d[i] += 1 at about the same time. Both will throw an exception, and both will execute d[i] = 1. End result: d[i] is 1 instead of 2. You've lost an increment!

Upon closer inspection, even d[i] += 1 alone might not be atomic and is thus open to race conditions. Internally, d[i] += 1 is executed as the following sequence of operations:

  • get the value at index i;
  • increment the value;
  • set the value at index i.

Each of the three operations is atomic and correct on its own, but there appears to be nothing to guarantee the atomicity of the entire sequence. If two processes attempt to execute d[i] += 1 for the same i concurrently, one of the increments can get lost for the reasons explained above.
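One common way to close that gap (not part of the original answer, just a sketch for illustration) is to hold a multiprocessing.Lock across the whole read-modify-write; the lock argument and the d.get(i, 0) fallback below are additions made for this example:

from multiprocessing import Process, Manager, Lock

def div(x, d, lock):
    for i in range(1, x):
        if x % i == 0:
            # Hold the lock across the whole read-modify-write so another
            # process cannot interleave between the read and the write.
            with lock:
                d[i] = d.get(i, 0) + 1

if __name__ == '__main__':
    mgr = Manager()
    d = mgr.dict()
    lock = Lock()
    workers = [Process(target=div, args=(i, d, lock)) for i in range(1, 10)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    print(dict(d))  # always {1: 8, 2: 3, 3: 2, 4: 1}

Because the lock serializes every update to the shared dict, the lost-update interleaving described above can no longer occur.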

An alternative to using a shared dictionary is for each process to maintain its own set of counts and to aggregate those sets at the end. This way it is harder to introduce subtle bugs. It may also lead to better performance, since there is less need for interprocess communication.
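A minimal sketch of that idea, assuming multiprocessing.Pool and collections.Counter (neither appears in the original answer): each worker builds a plain local Counter, and the parent merges the partial results after all workers finish, so no shared state is ever mutated concurrently.

from collections import Counter
from multiprocessing import Pool

def div(x):
    # Each worker counts divisors in its own local Counter; nothing is shared.
    local = Counter()
    for i in range(1, x):
        if x % i == 0:
            local[i] += 1
    return local

if __name__ == '__main__':
    with Pool() as pool:
        partials = pool.map(div, range(1, 10))
    # Merge the per-process results in the parent; Counter addition
    # sums the counts key by key.
    total = sum(partials, Counter())
    print(dict(total))  # {1: 8, 2: 3, 3: 2, 4: 1}

Only the finished Counters cross the process boundary, one per worker, instead of one round trip per dictionary update, which is where the reduced interprocess communication comes from.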
