
When using run_in_executor in asyncio, is the event loop executed in the main thread?

I am experimenting with multi-threaded execution using asyncio.
My understanding is that the event loop manages the threads and executes functions on them.
If that is the case, it should not be possible to launch a new function while the main thread is still busy.
However, when I run the following code, a new function starts executing while the main thread is sleeping. Could you tell me the reason?

ref: https://stackoverflow.com/a/60747799

import asyncio
from concurrent.futures.thread import ThreadPoolExecutor
from time import sleep
import logging

logging.basicConfig(
    level=logging.DEBUG, format="%(asctime)s %(thread)s %(funcName)s %(message)s"
)

def long_task(t):
    """Simulate long IO bound task."""
    logging.info("2. t: %s", t)
    sleep(t)
    logging.info("5. t: %s", t)
    return t ** 2

async def main():
    loop = asyncio.get_running_loop()
    executor = ThreadPoolExecutor(max_workers=2)
    inputs = range(1, 5)
    logging.info("1.")
    futures = [loop.run_in_executor(executor, long_task, i) for i in inputs]
    logging.info("3.")
    sleep(3)
    logging.info("4.")
    results = await asyncio.gather(*futures)
    logging.info("6.")

if __name__ == "__main__":
    asyncio.run(main())

expected output

2022-02-08 22:59:08,896 139673219430208 __init__ Using selector: EpollSelector
2022-02-08 22:59:08,896 139673219430208 main 1.
2022-02-08 22:59:08,897 139673194632960 long_task 2. t: 1
2022-02-08 22:59:08,897 139673186240256 long_task 2. t: 2
2022-02-08 22:59:08,897 139673219430208 main 3.
2022-02-08 22:59:09,898 139673194632960 long_task 5. t: 1
2022-02-08 22:59:10,898 139673186240256 long_task 5. t: 2
2022-02-08 22:59:13,400 139673219430208 main 4.
2022-02-08 22:59:09,898 139673194632960 long_task 2. t: 3
2022-02-08 22:59:10,899 139673186240256 long_task 2. t: 4
2022-02-08 22:59:12,902 139673194632960 long_task 5. t: 3
2022-02-08 22:59:14,903 139673186240256 long_task 5. t: 4
2022-02-08 22:59:14,903 139673219430208 main 6.

actual output

2022-02-08 22:59:08,896 139673219430208 __init__ Using selector: EpollSelector
2022-02-08 22:59:08,896 139673219430208 main 1.
2022-02-08 22:59:08,897 139673194632960 long_task 2. t: 1
2022-02-08 22:59:08,897 139673186240256 long_task 2. t: 2
2022-02-08 22:59:08,897 139673219430208 main 3.
2022-02-08 22:59:09,898 139673194632960 long_task 5. t: 1
2022-02-08 22:59:09,898 139673194632960 long_task 2. t: 3
2022-02-08 22:59:10,898 139673186240256 long_task 5. t: 2
2022-02-08 22:59:10,899 139673186240256 long_task 2. t: 4
2022-02-08 22:59:12,902 139673194632960 long_task 5. t: 3
2022-02-08 22:59:13,400 139673219430208 main 4.
2022-02-08 22:59:14,903 139673186240256 long_task 5. t: 4
2022-02-08 22:59:14,903 139673219430208 main 6.

Between 3 and 4, sleep(3) is being executed in the main thread. I understand that the end of long_task(1) and long_task(2), which started earlier in the thread pool, is printed during this time, but why does the next task start running during this time? If the event loop is in the main thread, then sleep(3) should not allow the execution of a new function.

When using run_in_executor in asyncio, is the event loop executed in the main thread?

Yes, it is - but run_in_executor submits the callables to an executor, allowing them to run without assistance from the event loop.
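As a quick way to see both halves of that statement, here is a minimal sketch that reports which thread the coroutine runs on and which thread the submitted callable runs on. The names worker_thread_name and check_threads, and the "pool" thread name prefix, are illustrative choices made for this example only.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor


def worker_thread_name():
    # Executed by a pool worker, not by the thread that runs the event loop.
    return threading.current_thread().name


async def main():
    loop = asyncio.get_running_loop()
    # The coroutine itself runs on whichever thread called asyncio.run().
    on_main = threading.current_thread() is threading.main_thread()
    with ThreadPoolExecutor(max_workers=1, thread_name_prefix="pool") as executor:
        worker = await loop.run_in_executor(executor, worker_thread_name)
    return on_main, worker


def check_threads():
    return asyncio.run(main())


if __name__ == "__main__":
    on_main, worker = check_threads()
    print("coroutine on main thread:", on_main)
    print("callable ran on:", worker)
```

When run as a script, the first value should be True and the second should be the name of a pool thread.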

Between 3 and 4, sleep(3) is being executed in the main thread. I understand that the end of long_task(1) and long_task(2), which started earlier in the thread pool, is printed during this time, but why does the next task start running during this time? If the event loop is in the main thread, then sleep(3) should not allow the execution of a new function.

ThreadPoolExecutor(max_workers=2) creates a thread pool that can scale up to two workers. run_in_executor is a wrapper around Executor.submit that ensures that the final result is propagated to asyncio. Its implementation could look like this (the actual code is a bit more complex because it handles cancellation and other concerns, but this is the gist):

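class EventLoop:
    # ...
    def run_in_executor(self, executor, f, *args):
        # asyncio-side future that the caller can await
        async_future = self.create_future()
        # hand the callable to the pool; returns a concurrent.futures.Future
        handle = executor.submit(f, *args)
        def when_done(_):
            # invoked from the worker thread, so hop back to the loop's thread
            # (simplified: exceptions and cancellation are not handled here)
            self.call_soon_threadsafe(async_future.set_result, handle.result())
        handle.add_done_callback(when_done)
        return async_future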
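The same idea can be tried out as a standalone function. The following is only a sketch under the same simplifications: run_in_executor_sketch and demo are hypothetical names, cancellation is still ignored, and the one addition is that an exception raised by the callable is forwarded instead of being lost. In real code you would simply call loop.run_in_executor, or use asyncio.wrap_future to bridge a concurrent.futures.Future yourself.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def run_in_executor_sketch(loop, executor, f, *args):
    """Hypothetical stand-in for loop.run_in_executor (no cancellation handling)."""
    async_future = loop.create_future()
    handle = executor.submit(f, *args)  # concurrent.futures.Future

    def when_done(done):
        # Usually called from the worker thread that finished the task, so the
        # asyncio future must be completed via call_soon_threadsafe.
        exc = done.exception()
        if exc is not None:
            loop.call_soon_threadsafe(async_future.set_exception, exc)
        else:
            loop.call_soon_threadsafe(async_future.set_result, done.result())

    handle.add_done_callback(when_done)
    return async_future


async def demo():
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=2) as executor:
        futures = [run_in_executor_sketch(loop, executor, pow, i, 2) for i in range(1, 5)]
        return await asyncio.gather(*futures)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Note that nothing in when_done requires the event loop to be running at the moment the task finishes: the callback only schedules work for the loop to pick up later.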

The call to submit pushes the callable and its arguments into a thread-safe queue. The pool's workers run in an infinite loop that consumes that queue, exiting only when the executor is told to shut down.
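To make the worker loop concrete, here is a toy pool built on queue.Queue. It is an illustration of the pattern described above, not the code of concurrent.futures; TinyPool, its results parameter, and the None shutdown sentinel are all assumptions made for this sketch.

```python
import queue
import threading


class TinyPool:
    """Toy thread pool for illustration only."""

    def __init__(self, max_workers):
        self._tasks = queue.Queue()  # unbounded and thread-safe
        self._threads = [
            threading.Thread(target=self._work) for _ in range(max_workers)
        ]
        for thread in self._threads:
            thread.start()

    def _work(self):
        # Each worker loops forever, pulling the next task off the queue.
        while True:
            item = self._tasks.get()
            if item is None:  # shutdown sentinel
                break
            f, args, results = item
            results.append(f(*args))

    def submit(self, f, *args, results):
        # Only enqueues the task; never waits for a free worker.
        self._tasks.put((f, args, results))

    def shutdown(self):
        # One sentinel per worker, queued behind any pending tasks.
        for _ in self._threads:
            self._tasks.put(None)
        for thread in self._threads:
            thread.join()


if __name__ == "__main__":
    pool = TinyPool(max_workers=2)
    out = []
    for i in range(4):
        pool.submit(pow, i, 2, results=out)
    pool.shutdown()
    print(sorted(out))
```

The thread that calls submit takes no part in running the tasks; it only puts items on the queue.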

If you submit more tasks than there are workers in the pool, the additional tasks will still be placed in the queue, waiting for their turn to be processed. (The queue is an unbounded channel, so Executor.submit() never blocks.) Once a worker is done with a task, it will request the next task off the queue, which is why your extra tasks get executed. It doesn't matter that the main thread is stuck in time.sleep() at that point - the functions were submitted to the executor prior to that, and are sitting in the queue, so the workers can get to them just fine.
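This behaviour can be reproduced without asyncio at all. The sketch below submits four short tasks to a two-worker pool and then blocks the main thread in time.sleep(); run_demo and the specific durations are arbitrary choices for the illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_demo():
    started = []  # (task id, worker thread name)

    def task(i):
        started.append((i, threading.current_thread().name))
        time.sleep(0.01)  # stand-in for blocking I/O
        return i * i

    executor = ThreadPoolExecutor(max_workers=2)
    # Four tasks, two workers: two start right away, two wait in the queue.
    futures = [executor.submit(task, i) for i in range(4)]
    # The main thread is now blocked and does nothing to help the pool.
    time.sleep(0.5)
    all_done_during_sleep = all(f.done() for f in futures)
    executor.shutdown()
    return all_done_during_sleep, sorted(i for i, _ in started)


if __name__ == "__main__":
    print(run_demo())
```

All four tasks, including the two that were queued, should have finished by the time the main thread wakes up, because the workers drain the queue on their own.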


Finally, in normal asyncio code, an async function must never call time.sleep(); it must await asyncio.sleep() instead. (I'm aware that you did it intentionally to block the thread running the event loop, but it is something that beginners are often not aware of, so it needs to be pointed out.)
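The difference is easy to observe with a background task that counts how often the event loop lets it run. This is a minimal sketch; ticker, count_ticks, and the chosen intervals are made up for the example.

```python
import asyncio
import time


async def ticker(ticks):
    # Records one tick every time the event loop gets to run this task.
    while True:
        ticks.append(1)
        await asyncio.sleep(0.01)


async def count_ticks(blocking):
    ticks = []
    task = asyncio.create_task(ticker(ticks))
    await asyncio.sleep(0)  # give the ticker a chance to start
    before = len(ticks)
    if blocking:
        time.sleep(0.2)  # blocks the loop's thread: no other task can run
    else:
        await asyncio.sleep(0.2)  # yields to the loop: the ticker keeps running
    during = len(ticks) - before
    task.cancel()
    return during


if __name__ == "__main__":
    print("ticks during time.sleep:   ", asyncio.run(count_ticks(True)))
    print("ticks during asyncio.sleep:", asyncio.run(count_ticks(False)))
```

With time.sleep() the count stays at zero, since the loop never regains control; with asyncio.sleep() the ticker runs many times in the same interval.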
