
Thread locking failing in dead-simple example

This is the simplest toy example. I know about concurrent.futures and higher-level code; I'm picking the toy example because I'm teaching it (as part of the same material as the high-level stuff).

It increments counter from different threads, and I get... well, here it is even weirder. Usually I get a counter smaller than it should be (e.g. instead of 5M, often much smaller, like 20k). But as I decrease the number of loops, at some number like 1,000 it is consistently right. Then at some intermediate number I get almost the right value: occasionally correct, but once in a while slightly larger than the product nthread x nloop. I am running it repeatedly in a Jupyter cell, but the first line really should reset counter to zero, not keep any old total.

import threading
from threading import Thread

lock = threading.Lock()
counter, nthread, nloop = 0, 100, 50_000

def increment(n, lock):
    global counter
    for _ in range(n):
        lock.acquire()
        counter += 1
        lock.release()

for _ in range(nthread):
    t = Thread(target=increment, args=(nloop, lock))
    t.start()
    
print(f"{nloop:,} loops X {nthread:,} threads -> counter is {counter:,}")

If I add .join() the behavior changes, but it is still not correct. For example, in the version that doesn't try to lock:

from threading import Thread

counter, nthread, nloop = 0, 100, 50_000

def increment(n):
    global counter
    for _ in range(n):
        counter += 1

for _ in range(nthread):
    t = Thread(target=increment, args=(nloop,))
    t.start()
    t.join()
    
print(f"{nloop:,} loops X {nthread:,} threads -> counter is {counter:,}")
# --> 50,000 loops X 100 threads -> counter is 5,022,510

The exact overcount varies, but I see something like that repeatedly.
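As an aside on why an unlocked counter += 1 can lose or garble updates at all: it is not a single atomic operation but a separate load, add, and store. The dis module makes this visible (a small standalone illustration, not part of the question's code):

```python
import dis

counter = 0

def increment():
    global counter
    counter += 1  # reads counter, adds 1, writes it back: separate steps

# The disassembly shows a LOAD ... / add / STORE ... sequence; a thread
# switch between the load and the store silently loses an update.
dis.dis(increment)
```

The exact opcode names vary between Python versions, but the load/store split is always there.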

I don't really want to .join() in the lock example, because I want to illustrate the idea of a background job. But I can wait on the thread's aliveness (thank you Frank Yellin!), and that fixes the lock case. The overcount still troubles me, though.
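A minimal sketch of that aliveness-waiting idea, with a placeholder job standing in for the real work:

```python
import threading
import time

def background_job():
    time.sleep(0.2)  # stand-in for real work

t = threading.Thread(target=background_job)
t.start()

# Instead of t.join(), poll the thread's aliveness; between checks the
# main thread is free to do other things, which is the point of a
# background job.
while t.is_alive():
    time.sleep(0.01)

print("background job finished")
```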

You're not waiting until all your threads are done before looking at counter. That's also why you're getting your result so quickly.

    threads = []
    for _ in range(nthread):
        t = threading.Thread(target=increment, args=(nloop, lock))
        t.start()
        threads.append(t)

    for thread in threads:
        thread.join()

    print(f"{nloop:,} loops X {nthread:,} threads -> counter is {counter:,}")

prints out the expected result:

50,000 loops X 100 threads -> counter is 5,000,000

Updated. I highly recommend using ThreadPoolExecutor() instead, which takes care of tracking the threads for you.

    from concurrent.futures import ThreadPoolExecutor

    with ThreadPoolExecutor() as executor:
        for _ in range(nthread):
            executor.submit(increment, nloop, lock)
    print(...)

will give you the answer you want, and takes care of waiting for the threads.
