
How to run a blocking task asynchronously with ProcessPoolExecutor and asyncio?

I'm trying to run a blocking task asynchronously with ProcessPoolExecutor (it works with ThreadPoolExecutor, but I need ProcessPoolExecutor for a CPU-bound task). Here is my code:


import asyncio
import time
from concurrent.futures import ProcessPoolExecutor
 
 
async def run_in_thread(task, *args):
    with ProcessPoolExecutor() as process_pool:
        loop = asyncio.get_event_loop()
        result = await loop.run_in_executor(process_pool, task, *args)
        return result
        
async def main_task():
    while True:
        await asyncio.sleep(1)
        print("ticker")

async def main():
    asyncio.create_task(main_task())

    global blocking_task
    def blocking_task():
        time.sleep(5)
        print("blocking task done!")
    await run_in_thread(blocking_task)
 
 
if __name__ == "__main__":
    asyncio.run(main())

And I get this error:

result = await loop.run_in_executor(process_pool, task, *args)
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.        

I don't understand where the issue is; can someone please help me? I'd also like to understand why it works with ThreadPoolExecutor but not with ProcessPoolExecutor.

I was expecting the code to print:

ticker
ticker
ticker
ticker
ticker
blocking task done!

Move the definition of blocking_task to the top level of the module. As the script stands, this function is invisible to the worker processes. The code of the function isn't sent to the other process directly, only its name. The other process performs its own separate import of the script, but that name isn't defined at the top level there.
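You can see the "only its name" part directly with the pickle module, which is what ProcessPoolExecutor relies on to ship the callable to the workers. This is just an illustrative sketch; the function name top_level is made up:

import pickle
import pickletools

def top_level():
    pass

# Disassemble the pickled function: the stream contains only the module
# name ('__main__') and the function's name, never its bytecode.
pickletools.dis(pickle.dumps(top_level))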

It's the same logic as if you tried to import this script into another script. Let's say this script is in a file named foo.py. After you do import foo, there is no function named foo.blocking_task, so you would be unable to call it.
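To make that concrete, here is a minimal sketch of the analogy, assuming your original script is saved as foo.py (the filename is only for illustration):

# foo.py is your original script: blocking_task is only defined inside
# main(), and main() only runs under the `if __name__ == "__main__"` guard,
# which importing does not trigger.
import foo

foo.blocking_task()  # AttributeError: module 'foo' has no attribute 'blocking_task'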

This would be a little clearer if you looked at the whole traceback instead of just the last line.

Incidentally, putting a global statement in front of the function definition isn't the same thing as moving the definition to the top level. In your script the name blocking_task does not exist at module level until the main() function actually runs (which the worker process never does). In the working script below, the name blocking_task exists as soon as the module is imported.

import asyncio
import time
from concurrent.futures import ProcessPoolExecutor
 
 
async def run_in_thread(task, *args):
    with ProcessPoolExecutor() as process_pool:
        loop = asyncio.get_event_loop()
        result = await loop.run_in_executor(process_pool, task, *args)
        return result
        
async def main_task():
    while True:
        await asyncio.sleep(1)
        print("ticker")

# Now defined at module level, so the worker processes can find it by
# name when they perform their own import of this script.
def blocking_task():
    time.sleep(5)
    print("blocking task done!")
        
async def main():
    asyncio.create_task(main_task())
    await run_in_thread(blocking_task)
 
if __name__ == "__main__":
    asyncio.run(main())

This prints exactly what you were expecting.
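As for why the original version works with ThreadPoolExecutor: threads run inside the same process and share its memory, so the function object is handed to the worker directly and never has to be pickled and re-imported by name. A minimal sketch of that (keeping your nested function, with the executor swapped for a ThreadPoolExecutor and the ticker left out for brevity) runs without the error:

import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


async def main():
    # A nested function is fine here: the thread shares this process's
    # memory, so no pickling or re-import of the module is involved.
    def blocking_task():
        time.sleep(5)
        print("blocking task done!")

    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor() as thread_pool:
        await loop.run_in_executor(thread_pool, blocking_task)


if __name__ == "__main__":
    asyncio.run(main())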
