
How to properly implement producer consumer in python

I have two threads in a producer-consumer pattern. When the consumer receives data, it calls a time-consuming function expensive() and then enters a for loop.

But if new data arrives while the consumer is working, it should abort the current work (exit the loop) and start over with the new data.

I tried with a queue.Queue something like this:

import queue

q = queue.Queue()

def producer():
    while True:
        ...
        q.put(d)
      
def consumer():
    while True:
        d = q.get()
        expensive(d)
        for i in range(10000):
            ...
            if not q.empty():
                break
    

But the problem with this code is that if the producer puts data too fast and the queue builds up many items, the consumer will do the expensive(d) call plus one loop iteration and then abort, for each item, which is time-consuming. The code works, but it is not optimized.
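To make the cost concrete, here is a minimal sketch of the backlog case. It assumes a hypothetical stand-in for expensive() that only records its argument, and a hypothetical consume_available() that repeats the consumer logic above but returns once the queue is drained, so the demo terminates.

```python
import queue

q = queue.Queue()
calls = []                      # records each expensive() invocation


def expensive(d):
    calls.append(d)             # hypothetical stand-in for the slow function


def consume_available():
    """Same logic as consumer(), but stops once the queue is drained."""
    while not q.empty():
        d = q.get()
        expensive(d)
        for i in range(10000):
            if not q.empty():   # newer data is waiting: abandon this item
                break


for d in range(5):              # the producer outpaced the consumer
    q.put(d)

consume_available()
print(calls)                    # every queued item paid for an expensive() call
```

Only the last item is ever processed to completion, yet expensive() ran once for each of the five queued items.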

Without modifying the code in expensive, one solution could be to run it as a separate process, which gives you the ability to terminate it prematurely. Since there's no mention of how long expensive runs, this may or may not be more time-efficient, however.

import multiprocessing as mp
import queue

q = queue.Queue()


def producer():
    while True:
        ...
        q.put(d)
  
def consumer():
    while True:
        d = q.get()
        # mp.Process (not a thread) is what supports terminate()/kill()
        exp = mp.Process(target=expensive, args=(d,))
        exp.start()
        for i in range(10000):
            ...
            if not q.empty():
                exp.terminate() # or exp.kill()
                break

Well, one way is to use a queue design that keeps internal lists of waiting and working threads. You can then create several consumer threads to wait on the queue and, when work arrives, set a known consumer thread to do the work. When the thread has finished, it calls into the queue to remove itself from the working list and add itself to the waiting list.

The consumer threads each have an atomic 'abort' flag that can signal the thread to finish early. There will be some latency while the thread performs inner loops, but that will not matter.
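As a rough illustration of the abort flag, the sketch below uses a threading.Event checked once per loop iteration. The AbortableWorker class, its steps parameter, and the time.sleep() call standing in for real work are all assumptions made for the example, not part of the original code.

```python
import threading
import time


class AbortableWorker(threading.Thread):
    """Hypothetical worker whose loop can be asked to stop early."""

    def __init__(self, steps):
        super().__init__()
        self.steps = steps
        self.abort = threading.Event()  # the 'abort' flag from the text
        self.completed = 0              # iterations actually finished

    def run(self):
        for _ in range(self.steps):
            if self.abort.is_set():     # checked once per iteration
                return
            time.sleep(0.001)           # stand-in for one unit of real work
            self.completed += 1


worker = AbortableWorker(steps=10_000)
worker.start()
time.sleep(0.05)       # let it make some progress
worker.abort.set()     # signal: finish early
worker.join()
print(f"stopped after {worker.completed} of {worker.steps} iterations")
```

The latency mentioned above is at most one iteration here, since the flag is only looked at between iterations.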

If new work arrives at the queue from the producer and the working list is not empty, the 'abort' flag of the working thread(s) can be set and their priority lowered to the minimum possible. The new work can then be dispatched to one of the waiting threads in the pool, setting it to work.

The waiting threads will need a 'start' function that signals an event, semaphore, or condition variable that the waiting thread, well, waits on. That allows the producer that supplied the work to set that specific thread running, rather than the 'usual' practice where any thread from a pool may pick up work.

Such a design allows new work to be started 'immediately', makes the previous work thread irrelevant by de-prioritizing it and avoids the overheads of thread/process termination.
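One possible way to put the pieces together is sketched below. The names Worker, PreemptingPool, submit, finished, and wait_idle are hypothetical, and the 500-iteration loop with time.sleep() is only a placeholder for the real consumer loop. Python's threading module does not expose thread priorities, so the de-prioritizing step is left out; the sketch also assumes a waiting thread is always available when work is submitted.

```python
import threading
import time


class Worker(threading.Thread):
    """Hypothetical pool thread with its own 'start' and 'abort' signals."""

    def __init__(self, pool):
        super().__init__(daemon=True)
        self.pool = pool
        self.wake = threading.Event()   # per-thread 'start' signal
        self.abort = threading.Event()  # asks the current job to finish early
        self.job = None

    def run(self):
        while True:
            self.wake.wait()            # park until this thread is dispatched
            self.wake.clear()
            result = self.process(self.job)
            self.pool.finished(self, result)

    def process(self, data):
        for _ in range(500):            # placeholder for the consumer's loop
            if self.abort.is_set():
                return (data, "aborted")
            time.sleep(0.001)           # stand-in for one unit of work
        return (data, "done")


class PreemptingPool:
    def __init__(self, size):
        self.cond = threading.Condition()
        self.results = []
        self.working = []
        self.waiting = [Worker(self) for _ in range(size)]
        for w in self.waiting:
            w.start()

    def submit(self, data):
        with self.cond:
            for busy in self.working:   # previous work is now irrelevant
                busy.abort.set()
            w = self.waiting.pop()      # sketch: assumes a thread is waiting
            w.job = data
            w.abort.clear()
            self.working.append(w)
        w.wake.set()                    # start that specific thread

    def finished(self, worker, result):
        with self.cond:
            self.results.append(result)
            self.working.remove(worker)
            self.waiting.append(worker)
            self.cond.notify_all()

    def wait_idle(self):
        with self.cond:
            self.cond.wait_for(lambda: not self.working)


pool = PreemptingPool(size=3)
pool.submit("first")
time.sleep(0.05)
pool.submit("second")   # aborts "first" and starts at once on another thread
pool.wait_idle()
print(pool.results)
```

The second submit does not wait for the first job to notice its abort flag: the new data starts on a different thread right away, and the aborted thread returns itself to the waiting list whenever it gets around to checking.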

