multiprocessing.Process and asyncio loop communication

import asyncio
from multiprocessing import Queue, Process
import time

task_queue = Queue()

# This is simulating the task
async def do_task(task_number):
  for progress in range(task_number):
    print(f'{progress}/{task_number} doing')
    await asyncio.sleep(10)

# This is the loop that accepts and runs tasks
async def accept_tasks():
  event_loop = asyncio.get_event_loop()
  while True:
    task_number = task_queue.get()  # <-- this blocks the event loop from running do_task()
    event_loop.create_task(do_task(task_number))

# This is the starting point of the process,
# the event loop runs here
def worker():
  event_loop = asyncio.get_event_loop()
  event_loop.run_until_complete(accept_tasks())

# Run a new process
Process(target=worker).start()

# Simulate adding tasks every 1 second
for _ in range(1,50):
  task_queue.put(_)
  print('added to queue', _)
  time.sleep(1)

I'm trying to run a separate process that runs an event loop to do I/O operations. Now, from a parent process, I'm trying to "queue-in" tasks. The problem is that do_task() does not run. The only solution that works is polling (i.e. checking whether the queue is empty, then sleeping X seconds).
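
For reference, the polling workaround mentioned above looks roughly like this (a sketch against the task_queue and do_task defined above; accept_tasks_polling is a hypothetical variant of accept_tasks, and the 1-second interval is arbitrary):

async def accept_tasks_polling():
  event_loop = asyncio.get_event_loop()
  while True:
    if not task_queue.empty():
      task_number = task_queue.get()
      event_loop.create_task(do_task(task_number))
    else:
      # yield to the event loop so running do_task() coroutines can make progress
      await asyncio.sleep(1)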

After some research, the problem seems to be that task_queue.get() isn't doing event-loop-friendly IO.

aiopipe provides a solution, but assumes both processes are running in an event loop.

I tried creating this, but the consumer isn't consuming anything...

read_fd, write_fd = os.pipe()
consumer = AioPipeReader(read_fd)
producer = os.fdopen(write_fd, 'w')

A simple workaround for this situation is to change task_number = task_queue.get() to task_number = await event_loop.run_in_executor(None, task_queue.get). That way the blocking Queue.get() function will be off-loaded to a thread pool and the current coroutine suspended, as a good asyncio citizen. Likewise, once the thread pool finishes with the function, the coroutine will resume execution.
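
Applied to the accept_tasks() coroutine from the question, the change would look roughly like this (a sketch reusing the original task_queue and do_task):

async def accept_tasks():
  event_loop = asyncio.get_event_loop()
  while True:
    # Queue.get() now blocks a thread-pool worker instead of the event loop
    task_number = await event_loop.run_in_executor(None, task_queue.get)
    event_loop.create_task(do_task(task_number))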

This approach is a workaround because it doesn't scale to a large number of concurrent tasks: each blocking call "turned async" that way will take a slot in the thread pool, and those that exceed the pool's maximum number of workers will not even start executing before a thread frees up. For example, rewriting all of asyncio to call blocking functions through run_in_executor would just result in a badly written threaded system. However, if you know that you have a small number of child processes, using run_in_executor is correct and can solve the problem very effectively.
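
If the default pool does become a bottleneck, one option (my own addition, not something the answer prescribes) is to install a larger ThreadPoolExecutor as the loop's default executor before accepting tasks:

from concurrent.futures import ThreadPoolExecutor

def worker():
  event_loop = asyncio.get_event_loop()
  # max_workers=32 is an arbitrary illustrative value; size it to the number
  # of blocking calls expected to be in flight at once
  event_loop.set_default_executor(ThreadPoolExecutor(max_workers=32))
  event_loop.run_until_complete(accept_tasks())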

I finally figured it out. There is a known way to do this with the aiopipe library, but it's made to run on two event loops in two different processes. In my case, I only have the child process running an event loop. To solve that, I changed the writing part into an unbuffered normal write using open(fd, buffering=0).

Here is the code, without any library:

import asyncio
from asyncio import StreamReader, StreamReaderProtocol
from multiprocessing import Process
import time
import os

# This is simulating the task
async def do_task(task_number):
  for progress in range(task_number):
    print(f'{progress}/{task_number} doing')
    await asyncio.sleep(1)

# This is the loop that accepts and runs tasks
async def accept_tasks(read_fd):
  loop = asyncio.get_running_loop()
  # Setup asynchronous reading
  reader = StreamReader()
  protocol = StreamReaderProtocol(reader)
  transport, _ = await loop.connect_read_pipe(
    lambda: protocol, os.fdopen(read_fd, 'rb', 0))

  while True:
    task_number = int(await reader.readline())
    await asyncio.sleep(1)
    loop.create_task(do_task(task_number))

  transport.close()

# This is the starting point of the process,
# the event loop runs here
def worker(read_fd):
  loop = asyncio.get_event_loop()
  loop.run_until_complete(accept_tasks(read_fd))

# Create read and write pipe
read_fd, write_fd = os.pipe()

# allow inheritance to child
os.set_inheritable(read_fd, True)
Process(target=worker, args=(read_fd, )).start()
# detach from parent
os.close(read_fd)

writer = os.fdopen(write_fd, 'wb', 0)
# Simulate adding tasks every 1 second
for _ in range(1,50):
  writer.write((f'{_}\n').encode())
  print('added to queue', _)
  time.sleep(1)

Basically, we use asynchronous reading on the child process's end, and do a non-buffered synchronous write on the parent process's end. To do the former, you need to connect the event loop as shown in the accept_tasks coroutine.
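
For completeness, on Python 3.7+ the reader-side setup can also be driven with asyncio.run(); the following is only a minimal sketch of the same connect_read_pipe idea, not part of the original answer:

import asyncio
import os
from asyncio import StreamReader, StreamReaderProtocol

async def read_pipe(read_fd):
  loop = asyncio.get_running_loop()
  reader = StreamReader()
  transport, _ = await loop.connect_read_pipe(
    lambda: StreamReaderProtocol(reader), os.fdopen(read_fd, 'rb', 0))
  try:
    while True:
      line = await reader.readline()
      if not line:  # empty bytes means the writer closed the pipe
        break
      print('received', line.decode().strip())
  finally:
    transport.close()

# in the child process: asyncio.run(read_pipe(read_fd))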
