
Thread locking failing in dead-simple example

This is the simplest toy example. I know about concurrent.futures and other higher-level APIs; I'm using the toy example because I'm teaching it (as part of the same material as the high-level approaches).

It increments counter from different threads, and I get... well, here it is even weirder. Usually I get a counter smaller than it should be (instead of, e.g., 5M), generally much smaller, like 20k. As I decrease the number of loops, at some point around 1,000 it is consistently right. At some intermediate number, it is almost right, occasionally correct, but once in a while slightly larger than nthread x nloop. I am running it repeatedly in a Jupyter cell, but the first line of the snippet should reset counter to zero each run, not keep any old total.

import threading
from threading import Thread

lock = threading.Lock()
counter, nthread, nloop = 0, 100, 50_000

def increment(n, lock):
    global counter
    for _ in range(n):
        lock.acquire()
        counter += 1
        lock.release()

for _ in range(nthread):
    t = Thread(target=increment, args=(nloop, lock))
    t.start()
    
print(f"{nloop:,} loops X {nthread:,} threads -> counter is {counter:,}")
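The undercount is the classic lost-update race: `counter += 1` compiles to separate load/add/store bytecode steps, so a thread switch between the load and the store discards another thread's update. One way to see the separate steps (a sketch; the exact opcode names vary across Python versions):

```python
import dis

counter = 0

def bump():
    global counter
    counter += 1  # load counter, add 1, store counter: separate bytecode steps

dis.dis(bump)  # the listing includes a LOAD_GLOBAL and a later STORE_GLOBAL
```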

If I add .join() the behavior changes, but is still not correct. For example, in the version that doesn't try to lock:

from threading import Thread

counter, nthread, nloop = 0, 100, 50_000

def increment(n):
    global counter
    for _ in range(n):
        counter += 1

for _ in range(nthread):
    t = Thread(target=increment, args=(nloop,))
    t.start()
    t.join()
    
print(f"{nloop:,} loops X {nthread:,} threads -> counter is {counter:,}")
# --> 50,000 loops X 100 threads -> counter is 5,022,510

The exact overcount varies, but I see something like that repeatedly.

I don't really want to .join() in the lock example, because I want to illustrate the idea of a background job. But I can poll the thread's is_alive() status (thank you Frank Yellin!), and that fixes the lock case. The overcount still troubles me, though.

You're not waiting until all your threads are done before looking at counter. That's also why you're getting your result so quickly.

    threads = []
    for _ in range(nthread):
        t = threading.Thread(target=increment, args=(nloop, lock))
        t.start()
        threads.append(t)

    for thread in threads:
        thread.join()

    print(f"{nloop:,} loops X {nthread:,} threads -> counter is {counter:,}")

prints out the expected result.

50,000 loops X 100 threads -> counter is 5,000,000

Updated. I highly recommend using ThreadPoolExecutor() instead, which takes care of tracking the threads for you.

    from concurrent.futures import ThreadPoolExecutor

    with ThreadPoolExecutor() as executor:
        for _ in range(nthread):
            executor.submit(increment, nloop, lock)
    print(...)

will give you the answer you want, and takes care of waiting for the threads.
