
Thread locking failing in dead-simple example

This is the simplest toy example. I know about concurrent.futures and higher-level code; I'm picking the toy example because I'm teaching it (as part of the same material as the high-level stuff).

It increments counter from different threads, and I get... well, here it is even weirder. Usually I get a counter smaller than it should be (e.g. instead of 5M, often much smaller, like 20k). But as I decrease the number of loops, at some number like 1,000 it is consistently right. Then at some intermediate number I get almost the right value: occasionally correct, but once in a while slightly larger than the product nthread x nloop. I am running it repeatedly in a Jupyter cell, but the first line really should reset counter to zero, not keep any old total.

import threading
from threading import Thread

lock = threading.Lock()
counter, nthread, nloop = 0, 100, 50_000

def increment(n, lock):
    global counter
    for _ in range(n):
        lock.acquire()
        counter += 1
        lock.release()

for _ in range(nthread):
    t = Thread(target=increment, args=(nloop, lock))
    t.start()
    
print(f"{nloop:,} loops X {nthread:,} threads -> counter is {counter:,}")

If I add .join() the behavior changes, but it is still not correct. For example, in the version that doesn't try to lock:

from threading import Thread

counter, nthread, nloop = 0, 100, 50_000

def increment(n):
    global counter
    for _ in range(n):
        counter += 1

for _ in range(nthread):
    t = Thread(target=increment, args=(nloop,))
    t.start()
    t.join()
    
print(f"{nloop:,} loops X {nthread:,} threads -> counter is {counter:,}")
# --> 50,000 loops X 100 threads -> counter is 5,022,510

The exact overcount varies, but I see something like that repeatedly.
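As an aside on why an unlocked counter += 1 can lose or garble updates at all: it is not a single atomic operation but a separate load, add, and store. The dis module makes this visible (a small standalone illustration, not part of the question's code):

```python
import dis

counter = 0

def increment():
    global counter
    counter += 1  # reads counter, adds 1, writes it back: separate steps

# The disassembly shows a LOAD ... / add / STORE ... sequence; a thread
# switch between the load and the store silently loses an update.
dis.dis(increment)
```

The exact opcode names vary between Python versions, but the load/store split is always there.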

I don't really want to .join() in the lock example, because I want to illustrate the idea of a background job. But I can wait on the thread's aliveness (thank you Frank Yellin!), and that fixes the lock case. The overcount still troubles me, though.
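A minimal sketch of that aliveness-waiting idea, with a placeholder job standing in for the real work:

```python
import threading
import time

def background_job():
    time.sleep(0.2)  # stand-in for real work

t = threading.Thread(target=background_job)
t.start()

# Instead of t.join(), poll the thread's aliveness; between checks the
# main thread is free to do other things, which is the point of a
# background job.
while t.is_alive():
    time.sleep(0.01)

print("background job finished")
```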

You're not waiting until all your threads are done before looking at counter. That's also why you're getting your result so quickly.

    threads = []
    for _ in range(nthread):
        t = threading.Thread(target=increment, args=(nloop, lock))
        t.start()
        threads.append(t)

    for thread in threads:
        thread.join()

    print(f"{nloop:,} loops X {nthread:,} threads -> counter is {counter:,}")

prints out the expected result:

50,000 loops X 100 threads -> counter is 5,000,000

Updated. I highly recommend using ThreadPoolExecutor() instead, which takes care of tracking the threads for you.

    from concurrent.futures import ThreadPoolExecutor

    with ThreadPoolExecutor() as executor:
        for _ in range(nthread):
            executor.submit(increment, nloop, lock)
    print(...)

will give you the answer you want, and takes care of waiting for the threads.
