Dynamically add to list of what Python asyncio's event loop should execute

I've got a function download_all that iterates through a hardcoded list of pages to download them all in sequence. But if I'd like to dynamically add to the list based on the results of a page, how can I do it? For example, download the first page, parse it, and based on the results add others to the event loop.

import asyncio

# download(page_number) is a coroutine defined elsewhere (not shown).

@asyncio.coroutine
def download_all():
    first_page = 1
    last_page = 100
    download_list = [download(page_number) for page_number in range(first_page, last_page)]
    gen = asyncio.wait(download_list)
    return gen

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    futures = loop.run_until_complete(download_all())

One way to accomplish this is by using a Queue.

#!/usr/bin/python3

import asyncio

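try:
    # Python 3.4: JoinableQueue provides join()/task_done()
    from asyncio import JoinableQueue as Queue
except ImportError:
    # Python 3.5+: JoinableQueue was removed; Queue has the same methods
    from asyncio import Queue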

@asyncio.coroutine
def do_work(task_name, work_queue):
    while not work_queue.empty():
        queue_item = work_queue.get_nowait()

        # simulate condition where task is added dynamically
        if queue_item % 2 != 0:
            work_queue.put_nowait(2)
            print('Added additional item to queue')

        print('{0} got item: {1}'.format(task_name, queue_item))
        yield from asyncio.sleep(queue_item)
        print('{0} finished processing item: {1}'.format(task_name, queue_item))

if __name__ == '__main__':

    queue = Queue()

    # Load initial jobs into queue
    for x in range(1, 6):
        queue.put_nowait(x)

    # use 3 workers to consume tasks
    taskers = [
        do_work('task1', queue),
        do_work('task2', queue),
        do_work('task3', queue)
    ]

    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.wait(taskers))
    loop.close()

Using a queue from asyncio, you can keep the "units" of work separate from the tasks/futures that are initially handed to asyncio's event loop. This lets you add extra units of work whenever some condition is met.

Note that in the example above even-numbered items are terminal: processing one does not add another item to the queue. This guarantees that all work eventually completes. In your case you could use a different condition to decide whether another item is added to the queue.

Output:

Added additional item to queue
task2 got item: 1
task1 got item: 2
Added additional item to queue
task3 got item: 3
task2 finished processing item: 1
task2 got item: 4
task1 finished processing item: 2
Added additional item to queue
task1 got item: 5
task3 finished processing item: 3
task3 got item: 2
task3 finished processing item: 2
task3 got item: 2
task2 finished processing item: 4
task2 got item: 2
task1 finished processing item: 5
task3 finished processing item: 2
task2 finished processing item: 2
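
On current Python versions the generator-based @asyncio.coroutine decorator is no longer available (it was removed in Python 3.11), so the same idea can be sketched with async def, assuming Python 3.7+. The names worker and main, the collected log list, the shortened sleep, and the use of queue.join() / task_done() to decide when to stop are choices made for this sketch and are not part of the answer above.

```python
import asyncio


async def worker(name, queue, log):
    """Consume items until cancelled; odd items enqueue one extra item."""
    while True:
        item = await queue.get()
        try:
            # Simulate a condition where more work is discovered dynamically.
            if item % 2 != 0:
                queue.put_nowait(2)
                log.append((name, "added", 2))
            log.append((name, "got", item))
            # Sleep is scaled down so the sketch finishes quickly.
            await asyncio.sleep(item * 0.01)
            log.append((name, "finished", item))
        finally:
            # Mark the item as done so queue.join() can return.
            queue.task_done()


async def main():
    queue = asyncio.Queue()
    log = []

    # Load the initial jobs into the queue.
    for x in range(1, 6):
        queue.put_nowait(x)

    # Use 3 workers to consume items.
    workers = [
        asyncio.create_task(worker("task{}".format(i), queue, log))
        for i in range(1, 4)
    ]

    # Wait until every item, including dynamically added ones, is processed.
    await queue.join()

    # The workers loop forever, so cancel them once the queue is drained.
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return log


if __name__ == "__main__":
    for entry in asyncio.run(main()):
        print(entry)
```

Unlike a loop that exits as soon as the queue looks empty, these workers block on queue.get(), so a worker that is idle will still pick up items that another worker adds later. queue.join() is what signals that all work, including dynamically added work, has finished.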

Please take a look at the Web Crawler example.

It uses an asyncio.JoinableQueue to store URLs for fetch tasks, and it also demonstrates many other useful techniques.
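
To connect this back to the question, here is a minimal sketch of the crawler-style pattern: fetch a page, parse it, and enqueue any newly discovered pages. It is not the code from the linked example. FAKE_SITE, fetch_links, and crawl are hypothetical names, and the download is simulated with an in-memory dict so the snippet runs without network access. It assumes Python 3.7+.

```python
import asyncio

# Hypothetical link graph standing in for real pages: page -> pages it links to.
FAKE_SITE = {
    1: [2, 3],
    2: [4],
    3: [4, 5],
    4: [],
    5: [1],
}


async def fetch_links(page):
    """Stand-in for downloading and parsing a page; returns its outgoing links."""
    await asyncio.sleep(0.01)  # simulate network latency
    return FAKE_SITE.get(page, [])


async def crawl(start_page, num_workers=3):
    queue = asyncio.Queue()
    seen = {start_page}  # pages already queued, to avoid downloading twice
    queue.put_nowait(start_page)

    async def worker():
        while True:
            page = await queue.get()
            try:
                for link in await fetch_links(page):
                    if link not in seen:
                        seen.add(link)
                        queue.put_nowait(link)
            finally:
                queue.task_done()

    workers = [asyncio.create_task(worker()) for _ in range(num_workers)]
    await queue.join()  # returns once every queued page has been processed
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return seen


if __name__ == "__main__":
    print(sorted(asyncio.run(crawl(1))))  # prints [1, 2, 3, 4, 5]
```

The seen set plays the role of the "terminal" condition in the earlier example: a page is only enqueued the first time it is discovered, so the crawl ends even though page 5 links back to page 1.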
