
Using asyncio.Queue for producer-consumer flow

I'm confused about how to use asyncio.Queue for a particular producer-consumer pattern in which both the producer and consumer operate concurrently and independently.

First, consider this example, which closely follows that from the docs for asyncio.Queue:

import asyncio
import random
import time

async def worker(name, queue):
    while True:
        sleep_for = await queue.get()
        await asyncio.sleep(sleep_for)
        queue.task_done()
        print(f'{name} has slept for {sleep_for:0.2f} seconds')

async def main(n):
    queue = asyncio.Queue()
    total_sleep_time = 0
    for _ in range(20):
        sleep_for = random.uniform(0.05, 1.0)
        total_sleep_time += sleep_for
        queue.put_nowait(sleep_for)
    tasks = []
    for i in range(n):
        task = asyncio.create_task(worker(f'worker-{i}', queue))
        tasks.append(task)
    started_at = time.monotonic()
    await queue.join()
    total_slept_for = time.monotonic() - started_at
    for task in tasks:
        task.cancel()
    # Wait until all worker tasks are cancelled.
    await asyncio.gather(*tasks, return_exceptions=True)
    print('====')
    print(f'{n} workers slept in parallel for {total_slept_for:.2f} seconds')
    print(f'total expected sleep time: {total_sleep_time:.2f} seconds')

if __name__ == '__main__':
    import sys
    n = 3 if len(sys.argv) == 1 else int(sys.argv[1])
    asyncio.run(main(n))

There is one finer detail about this script: the items are put into the queue synchronously, via queue.put_nowait(sleep_for) in a conventional for loop.
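To see why this detail matters: queue.put_nowait() never suspends, so it can run in ordinary synchronous code; on an unbounded queue it always succeeds, and on a bounded queue it raises asyncio.QueueFull instead of waiting. Here is a minimal sketch (not part of the original example) contrasting it with await queue.put(), which suspends until a slot frees up:

import asyncio

async def demo():
    # maxsize=1 makes the difference between the two calls visible
    queue = asyncio.Queue(maxsize=1)

    queue.put_nowait('a')          # succeeds: the queue has room
    try:
        queue.put_nowait('b')      # raises immediately: the queue is full
    except asyncio.QueueFull:
        print('put_nowait raised QueueFull')

    async def drain_later():
        await asyncio.sleep(0.1)
        item = await queue.get()
        print(f'consumed {item}')

    asyncio.create_task(drain_later())
    await queue.put('b')           # suspends until drain_later() frees a slot
    print('await queue.put() completed')

asyncio.run(demo())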

My goal is to create a script that uses async def worker() (or consumer()) and async def producer(). Both should be scheduled to run concurrently. No single consumer coroutine is explicitly tied to or chained from a producer.

How can I modify the program above so that the producer(s) is its own coroutine that can be scheduled concurrently with the consumers/workers?


There is a second example from PYMOTW. It requires the producer to know the number of consumers ahead of time, and uses None as a signal to the consumer that production is done.
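For reference, the sentinel pattern that example uses looks roughly like this (a paraphrased sketch with illustrative item counts, not PYMOTW's exact code): the producer enqueues one None per consumer, and each consumer exits its loop when it dequeues one.

import asyncio

async def producer(queue, num_consumers):
    for i in range(10):
        await queue.put(i)
    # one sentinel per consumer, so every consumer sees exactly one
    for _ in range(num_consumers):
        await queue.put(None)

async def consumer(name, queue):
    while True:
        item = await queue.get()
        if item is None:  # sentinel: production is done
            break
        print(f'{name} consumed {item}')

async def main():
    num_consumers = 3
    queue = asyncio.Queue()
    consumers = [asyncio.create_task(consumer(f'c{i}', queue))
                 for i in range(num_consumers)]
    await producer(queue, num_consumers)
    await asyncio.gather(*consumers)

asyncio.run(main())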

How can I modify the program above so that the producer(s) is its own coroutine that can be scheduled concurrently with the consumers/workers?

The example can be generalized without changing its essential logic:

  • Move the insertion loop to a separate producer coroutine.
  • Start the consumers in the background, letting them process the items as they are produced.
  • With the consumers running, start the producers and wait for them to finish producing items, as with await producer() or await gather(*producers), etc.
  • Once all producers are done, wait for consumers to process the remaining items with await queue.join().
  • Cancel the consumers, all of which are now idly waiting for the queue to deliver the next item, which will never arrive as we know the producers are done.

Here is an example implementing the above:

import asyncio, random
 
async def rnd_sleep(t):
    # sleep for T seconds on average
    await asyncio.sleep(t * random.random() * 2)
 
async def producer(queue):
    while True:
        # produce a token and send it to a consumer
        token = random.random()
        print(f'produced {token}')
        if token < .05:
            break
        await queue.put(token)
        await rnd_sleep(.1)
 
async def consumer(queue):
    while True:
        token = await queue.get()
        # process the token received from a producer
        await rnd_sleep(.3)
        queue.task_done()
        print(f'consumed {token}')
 
async def main():
    queue = asyncio.Queue()
 
    # fire up both the producers and the consumers
    producers = [asyncio.create_task(producer(queue))
                 for _ in range(3)]
    consumers = [asyncio.create_task(consumer(queue))
                 for _ in range(10)]
 
    # with both producers and consumers running, wait for
    # the producers to finish
    await asyncio.gather(*producers)
    print('---- done producing')
 
    # wait for the remaining tasks to be processed
    await queue.join()
 
    # cancel the consumers, which are now idle
    for c in consumers:
        c.cancel()
 
asyncio.run(main())
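Unlike the first example, this version does not wait for the cancellations to be delivered before main() returns (asyncio.run() cancels any still-pending tasks when the coroutine finishes). If you want main() itself to wait until every consumer has exited, you can mirror the first example with one more line; this is an optional addition, not part of the original answer:

    # optional, at the end of main(): wait until the cancelled
    # consumers have actually exited, as the first example does
    await asyncio.gather(*consumers, return_exceptions=True)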

Note that in real-life producers and consumers, especially those that involve network access, you probably want to catch IO-related exceptions that occur during processing. If the exception is recoverable, as most network-related exceptions are, you can simply catch the exception and log the error. You should still invoke task_done() because otherwise queue.join() will hang due to an unprocessed item. If it makes sense to re-try processing the item, you can return it into the queue prior to calling task_done(). For example:

# like the above, but handling exceptions during processing;
# process() stands in for your own processing coroutine
import aiohttp

async def consumer(queue):
    while True:
        token = await queue.get()
        try:
            # this uses aiohttp or whatever
            await process(token)
        except aiohttp.ClientError as e:
            print(f"Error processing token {token}: {e}")
            # If it makes sense, return the token to the queue to be
            # processed again. (You can use a counter to avoid
            # processing a faulty token infinitely.)
            #await queue.put(token)
        queue.task_done()
        print(f'consumed {token}')
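To make the retry counter mentioned in the comment concrete, one option is to enqueue (token, attempts) pairs instead of bare tokens. A sketch under that assumption, where process() and MAX_ATTEMPTS are illustrative names rather than part of the original answer:

# sketch: bounded retries via (token, attempts) pairs; the producer
# is assumed to enqueue (token, 0), and process() is a placeholder
MAX_ATTEMPTS = 3

async def consumer(queue):
    while True:
        token, attempts = await queue.get()
        try:
            await process(token)
            print(f'consumed {token}')
        except aiohttp.ClientError as e:
            print(f"Error processing token {token}: {e}")
            if attempts + 1 < MAX_ATTEMPTS:
                # re-queue before task_done() so queue.join()
                # keeps waiting for the retried item
                await queue.put((token, attempts + 1))
            else:
                print(f'giving up on {token} after {MAX_ATTEMPTS} attempts')
        queue.task_done()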
