
How to run a blocking task asynchronously with ProcessPoolExecutor and asyncio?

I'm trying to run a blocking task asynchronously with ProcessPoolExecutor (it works with ThreadPoolExecutor, but I need ProcessPoolExecutor for a CPU-bound task). Here is my code:


import asyncio
import time
from concurrent.futures import ProcessPoolExecutor
 
 
async def run_in_thread(task, *args):
    with ProcessPoolExecutor() as process_pool:
        loop = asyncio.get_event_loop()
        result = await loop.run_in_executor(process_pool, task, *args)
        return result
        
async def main_task():
    while True:
        await asyncio.sleep(1)
        print("ticker")

async def main():
    asyncio.create_task(main_task())

    global blocking_task
    def blocking_task():
        time.sleep(5)
        print("blocking task done!")
    await run_in_thread(blocking_task)
 
 
if __name__ == "__main__":
    asyncio.run(main())

And I get this error:

result = await loop.run_in_executor(process_pool, task, *args)
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.        

I don't understand where the issue is. Can someone please help me? I'd also like to understand why it works with ThreadPoolExecutor but not with ProcessPoolExecutor.

I was expecting the code to print:

ticker
ticker
ticker
ticker
ticker
blocking task done!

Move the definition of blocking_task to the outer level of the module. As the script stands, this function is invisible to other processes. The code of the function isn't sent directly to the other process, only its name. The other process performs its own separate import of the script (this is what happens under the spawn start method, the default on Windows and macOS), but the name isn't defined at the top level.
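To make the "only its name is sent" point concrete, here is a minimal sketch using pickle, which is what the process pool uses to ship work to its workers. The module name foo_demo and the one-line function body are made up for illustration; the name is deleted by hand to imitate a worker whose import never defined it.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the script, registered under the made-up name "foo_demo".
foo_demo = types.ModuleType("foo_demo")
exec("def blocking_task():\n    return 'done'\n", foo_demo.__dict__)
sys.modules["foo_demo"] = foo_demo

# The payload holds the module and function names, not the function body.
payload = pickle.dumps(foo_demo.blocking_task)
print(payload)

# While the name exists at module level, the receiving side can resolve it.
restored = pickle.loads(payload)
print(restored())

# Remove the name to imitate a worker whose import never defined it.
del foo_demo.blocking_task
try:
    pickle.loads(payload)
    lookup_error = None
except AttributeError as exc:
    lookup_error = exc
print(lookup_error)
```

The last lookup fails with an AttributeError, which is the kind of failure a worker process runs into when it cannot find the name it was asked to call.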

It's the same logic as if you tried to import this script into another script. Let's say this script is in a file named foo.py. After you do import foo, there is no function named foo.blocking_task, so you would be unable to call it.

This would be a little clearer if you looked at the whole traceback instead of just the last line.

Incidentally, using the global statement in front of the function definition isn't the same thing as moving the definition to the top level. In your script, the name blocking_task does not exist at module level until the main() function actually runs (which it never does in the secondary process). In the working script below, the name blocking_task exists as soon as the module is imported.
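The effect of the global statement can be checked without any process pool. The sketch below is an assumption-laden stand-in: it executes a condensed version of the original script's structure in a fresh module object named foo, which roughly imitates what import foo would do, and then checks when the name appears.

```python
import types

# Condensed, hypothetical version of the original script's structure.
source = '''
def main():
    global blocking_task

    def blocking_task():
        return "done"
'''

# Executing the source in a fresh module namespace imitates "import foo".
foo = types.ModuleType("foo")
exec(source, foo.__dict__)

# Right after the "import", only main exists at module level.
defined_after_import = hasattr(foo, "blocking_task")

# The global name is created only once main() has actually run.
foo.main()
defined_after_main = hasattr(foo, "blocking_task")

print(defined_after_import, defined_after_main)  # False True
```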

import asyncio
import time
from concurrent.futures import ProcessPoolExecutor
 
 
async def run_in_thread(task, *args):
    with ProcessPoolExecutor() as process_pool:
        loop = asyncio.get_event_loop()
        result = await loop.run_in_executor(process_pool, task, *args)
        return result
        
async def main_task():
    while True:
        await asyncio.sleep(1)
        print("ticker")

def blocking_task():
    time.sleep(5)
    print("blocking task done!")
        
async def main():
    asyncio.create_task(main_task())
    await run_in_thread(blocking_task)
 
if __name__ == "__main__":
    asyncio.run(main())

This prints exactly what you were expecting.
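As for why ThreadPoolExecutor works with the original layout: threads live in the same process and share its memory, so the function object is called directly and never has to be pickled or found by name. A minimal sketch, with a made-up local_task that returns a string instead of sleeping:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def main():
    # Defined inside main(), just like blocking_task in the original script.
    def local_task():
        return "ran in a thread"

    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor() as thread_pool:
        # The function object itself is handed to the worker thread.
        return await loop.run_in_executor(thread_pool, local_task)


result = asyncio.run(main())
print(result)
```

This only helps when the task releases the GIL (I/O, sleeping); for CPU-bound work the process pool with a module-level function remains the appropriate choice.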

