
Start asyncio event loop in separate thread and consume queue items

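I am writing a Python program that concurrently runs tasks pulled from a queue, as a way to learn asyncio.

Items are put onto the queue by interacting with the main thread (within a REPL). Whenever a task is put onto the queue, it should be consumed and executed immediately. My approach is to start a separate thread and pass a queue to the event loop running inside that thread.

The tasks run, but only sequentially, and it is not clear to me how to run them concurrently. My attempt is as follows: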

import asyncio
import time
import queue
import threading

def do_it(task_queue):
    '''Process tasks in the queue until the sentinel value is received'''
    _sentinel = 'STOP'

    def clock():
        return time.strftime("%X")

    async def process(name, total_time):
        status = f'{clock()} {name}_{total_time}:'
        print(status, 'START')
        current_time = time.time()
        end_time = current_time + total_time
        while current_time < end_time:
            print(status, 'processing...')
            await asyncio.sleep(1)
            current_time = time.time()
        print(status, 'DONE.')

    async def main():
        while True:
            item = task_queue.get()
            if item == _sentinel:
                break
            await asyncio.create_task(process(*item))

    print('event loop start')
    asyncio.run(main())
    print('event loop end')


if __name__ == '__main__':
    tasks = queue.Queue()
    th = threading.Thread(target=do_it, args=(tasks,))
    th.start()

    tasks.put(('abc', 5))
    tasks.put(('def', 3))

Any suggestions that point me towards running these tasks concurrently would be greatly appreciated!
Thanks

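Update
Thank you Frank Yellin and cynthi8! I reworked main() according to your advice:

  • removed the await before asyncio.create_task, which fixed the concurrency
  • added a waiting while loop so that main does not return prematurely
  • used the non-blocking mode of Queue.get()

The program now works as expected 👍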
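
The reworked main() itself is not shown in the question, so the following is only a rough reconstruction of what the three changes above might look like together. The `results` list, the `running` set, the `run_demo` helper, and the 0.05 second polling interval are assumptions made for illustration, and the durations are shortened so the sketch finishes quickly.

```python
import asyncio
import queue
import threading

_SENTINEL = 'STOP'

async def process(name, total_time, results):
    # Stand-in for the real work; records the order in which tasks finish.
    await asyncio.sleep(total_time)
    results.append(name)

async def main(task_queue, results):
    running = set()  # hypothetical bookkeeping of the tasks started so far
    while True:
        try:
            # Non-blocking mode: raises queue.Empty instead of stalling the loop.
            item = task_queue.get_nowait()
        except queue.Empty:
            # Yield to the event loop so already-started tasks can progress.
            await asyncio.sleep(0.05)
            continue
        if item == _SENTINEL:
            break
        # No await here, so the task runs in the background.
        running.add(asyncio.create_task(process(*item, results)))
    # Waiting loop so that main() does not return while tasks are still running.
    while any(not task.done() for task in running):
        await asyncio.sleep(0.05)

def run_demo():
    tasks = queue.Queue()
    results = []
    th = threading.Thread(target=lambda: asyncio.run(main(tasks, results)))
    th.start()
    tasks.put(('abc', 0.3))
    tasks.put(('def', 0.1))
    tasks.put(_SENTINEL)
    th.join()
    return results
```

Because both tasks overlap, the shorter one finishes first even though it was enqueued second. The cost of this version is the polling: the loop wakes up every 0.05 seconds even when the queue is empty.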

Update 2
user4815162342 offered further improvements; I have annotated his suggestions in the code below.

'''
Starts auxiliary thread which establishes a queue and consumes tasks within a
queue.
    
Allow enqueueing of tasks from within __main__ and termination of aux thread
'''
import asyncio
import time
import threading

def do_it(started):
    '''Process tasks in the queue until the sentinel value is received'''
    _sentinel = 'STOP'

    def clock():
        return time.strftime("%X")

    async def process(name, total_time):
        print(f'{clock()} {name}_{total_time}:', 'Started.')
        current_time = time.time()
        end_time = current_time + total_time
        while current_time < end_time:
            print(f'{clock()} {name}_{total_time}:', 'Processing...')
            await asyncio.sleep(1)
            current_time = time.time()
        print(f'{clock()} {name}_{total_time}:', 'Done.')

    async def main():
        # get_running_loop() returns the event loop running in the current OS
        # thread; store it on `started` to hand it out to the __main__ thread
        started.loop = asyncio.get_running_loop()
        started.queue = task_queue = asyncio.Queue()
        started.set()
        while True:
            item = await task_queue.get()
            if item == _sentinel:
                # task_done is used to tell join when the work in the queue is 
                # actually finished. A queue length of zero does not mean work
                # is complete.
                task_queue.task_done()
                break
            task = asyncio.create_task(process(*item))
            # Add a callback to be run when the Task is done.
            # Indicate that a formerly enqueued task is complete. Used by queue 
            # consumer threads. For each get() used to fetch a task, a 
            # subsequent call to task_done() tells the queue that the processing
            # on the task is complete.
            task.add_done_callback(lambda _: task_queue.task_done())            

        # keep loop going until all the work has completed
        # When the count of unfinished tasks drops to zero, join() unblocks.
        await task_queue.join()

    print('event loop start')
    asyncio.run(main())
    print('event loop end')

if __name__ == '__main__':
    # started Event is used for communication with thread th
    started = threading.Event()
    th = threading.Thread(target=do_it, args=(started,))
    th.start()
    # started.wait() blocks until started.set(), ensuring that the tasks and
    # loop variables are available from the event loop thread
    started.wait()
    tasks, loop = started.queue, started.loop

    # call_soon schedules the callback callback to be called with args arguments
    # at the next iteration of the event loop.
    # call_soon_threadsafe is required to schedule callbacks from another thread 
    
    # put_nowait enqueues an item without blocking (it raises QueueFull only
    # if a bounded queue is already full)
    loop.call_soon_threadsafe(tasks.put_nowait, ('abc', 5))
    loop.call_soon_threadsafe(tasks.put_nowait, ('def', 3))
    loop.call_soon_threadsafe(tasks.put_nowait, 'STOP')

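As others pointed out, the problem with your code is that it uses a blocking queue, which halts the event loop while it waits for the next item. The problem with the proposed solution, however, is that it introduces latency, because it must occasionally sleep to allow other tasks to run. Besides introducing latency, it also prevents the program from ever going fully idle, even when there are no items in the queue.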

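An alternative is to switch to the asyncio queue, which is designed for use with asyncio. This queue must be created inside the running loop, so you cannot pass it to do_it; you have to retrieve it from there instead. Also, since it is an asyncio primitive, its put method must be invoked through call_soon_threadsafe to ensure that the event loop notices it.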

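One final issue is that your main() function uses another busy loop to wait for all the tasks to complete. This can be avoided by using Queue.join, which is designed specifically for this use case.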

Here is your code adapted to incorporate all of the above suggestions, with the process function left unchanged from your original:

import asyncio
import time
import threading

def do_it(started):
    '''Process tasks in the queue until the sentinel value is received'''
    _sentinel = 'STOP'

    def clock():
        return time.strftime("%X")

    async def process(name, total_time):
        status = f'{clock()} {name}_{total_time}:'
        print(status, 'START')
        current_time = time.time()
        end_time = current_time + total_time
        while current_time < end_time:
            print(status, 'processing...')
            await asyncio.sleep(1)
            current_time = time.time()
        print(status, 'DONE.')

    async def main():
        started.loop = asyncio.get_running_loop()
        started.queue = task_queue = asyncio.Queue()
        started.set()
        while True:
            item = await task_queue.get()
            if item == _sentinel:
                task_queue.task_done()
                break
            task = asyncio.create_task(process(*item))
            task.add_done_callback(lambda _: task_queue.task_done())
        await task_queue.join()

    print('event loop start')
    asyncio.run(main())
    print('event loop end')

if __name__ == '__main__':
    started = threading.Event()
    th = threading.Thread(target=do_it, args=(started,))
    th.start()
    started.wait()
    tasks, loop = started.queue, started.loop

    loop.call_soon_threadsafe(tasks.put_nowait, ('abc', 5))
    loop.call_soon_threadsafe(tasks.put_nowait, ('def', 3))
    loop.call_soon_threadsafe(tasks.put_nowait, 'STOP')

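Note: an unrelated issue with your code is that it awaits the result of create_task(), which nullifies the usefulness of create_task() because the task is never allowed to run in the background. (It is equivalent to immediately joining a thread you have just started: you can do it, but it does not make much sense.) This issue is fixed both in the code above and in your edit to the question.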
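
As a variation the answer does not mention, the main thread can also enqueue items with asyncio.run_coroutine_threadsafe, which schedules the queue's put() coroutine on the loop and returns a concurrent.futures.Future. The sketch below is an illustration under assumptions, not part of the answer: the `worker`, `submit`, and `run_demo` names, the `results` list, the 5 second timeout, and the shortened durations are all made up for the example.

```python
import asyncio
import threading

_SENTINEL = 'STOP'

def worker(started, results):
    async def process(name, total_time):
        # Stand-in for the real work; records the order in which tasks finish.
        await asyncio.sleep(total_time)
        results.append(name)

    async def main():
        # Publish the loop and the queue to the thread that started us.
        started.loop = asyncio.get_running_loop()
        started.queue = task_queue = asyncio.Queue()
        started.set()
        while True:
            item = await task_queue.get()
            if item == _SENTINEL:
                task_queue.task_done()
                break
            task = asyncio.create_task(process(*item))
            task.add_done_callback(lambda _: task_queue.task_done())
        # Unblocks once task_done() has been called for every item.
        await task_queue.join()

    asyncio.run(main())

def submit(loop, task_queue, item):
    # Schedule queue.put(item) on the loop's thread from the calling thread.
    future = asyncio.run_coroutine_threadsafe(task_queue.put(item), loop)
    # Wait until the item is actually enqueued (timeout value is arbitrary).
    future.result(timeout=5)

def run_demo():
    started = threading.Event()
    results = []
    th = threading.Thread(target=worker, args=(started, results))
    th.start()
    started.wait()
    for item in [('abc', 0.3), ('def', 0.1), _SENTINEL]:
        submit(started.loop, started.queue, item)
    th.join()  # returns once the event loop thread has finished
    return results
```

Compared with call_soon_threadsafe(tasks.put_nowait, ...), this form lets the caller wait for confirmation that the item was enqueued, and it would also respect a bounded queue, at the price of blocking the calling thread briefly.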

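There are two problems with your code.

First, you should not have the await before asyncio.create_task. This is probably what is causing your code to run synchronously.

Then, once you have made the code run asynchronously, you need something after the while loop in main so that the code does not return immediately, but instead waits for all the jobs to finish. Another Stack Overflow answer suggests: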

while len(asyncio.all_tasks()) > 1:  # Any task besides main() itself?
    await asyncio.sleep(0.2)

Alternatively, there are versions of Queue that can keep track of running tasks.
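
Another way to wait without polling, offered here only as a sketch, is to keep explicit references to the tasks you create and await them together. The `background` set, the `results` list, the `run_demo` helper, and the fixed list of items are assumptions made for illustration.

```python
import asyncio

async def process(name, total_time, results):
    # Stand-in for the real work; records the order in which tasks finish.
    await asyncio.sleep(total_time)
    results.append(name)

async def main(items, results):
    background = set()  # strong references to the tasks still in flight
    for item in items:
        task = asyncio.create_task(process(*item, results))
        background.add(task)
        # Drop the reference automatically once the task completes.
        task.add_done_callback(background.discard)
    if background:
        # Wait for everything that is still running, with no sleep loop.
        await asyncio.gather(*background)

def run_demo():
    results = []
    asyncio.run(main([('abc', 0.2), ('def', 0.1)], results))
    return results
```

Here gather() suspends main() until every tracked task is done, so the loop is idle whenever no task has work to do.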

As an additional concern:

If the queue.Queue is empty, get() blocks by default; it does not return the sentinel string. https://docs.python.org/3/library/queue.html
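
To make the blocking behaviour concrete, here is a small sketch. It first shows that get_nowait() raises queue.Empty on an empty queue, and then shows one possible way to keep a queue.Queue without stalling the event loop: running the blocking get() in the default thread pool via loop.run_in_executor. The `consume` and `run_demo` names are hypothetical, and this executor approach is an assumption of the example rather than something proposed in the answers.

```python
import asyncio
import queue

_SENTINEL = 'STOP'

async def consume(task_queue):
    loop = asyncio.get_running_loop()
    seen = []
    while True:
        # The blocking get() runs in a worker thread, so the event loop
        # stays free to run other tasks while waiting for the next item.
        item = await loop.run_in_executor(None, task_queue.get)
        if item == _SENTINEL:
            break
        seen.append(item)
    return seen

def run_demo():
    q = queue.Queue()
    try:
        q.get_nowait()  # non-blocking variant of get()
        empty_raised = False
    except queue.Empty:
        empty_raised = True  # an empty queue raises instead of blocking
    for item in [('abc', 5), ('def', 3), _SENTINEL]:
        q.put(item)
    return empty_raised, asyncio.run(consume(q))
```

Calling run_demo() returns a pair: whether queue.Empty was raised on the empty queue, and the list of items consumed before the sentinel.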

