
Make a CPU-bound task asynchronous for FastAPI WebSockets

So I have a CPU-bound, long-running algorithm; let's call it task. Let's say it looks like this:

def task(parameters):
    result = 0
    for _ in range(10):
        for _ in range(10):
            for _ in range(10):
                result += do_things()
    return result

@app.get('/')
def results(parameters: BodyModel):
    return task(parameters)

If I encapsulate that in a synchronous def path operation function, everything works fine, because FastAPI runs it in a separate thread: I can still access other paths, and concurrency does its job by pushing my CPU-bound task off the event loop. But now I want to switch to WebSockets in order to communicate intermediate results. For that to work, I have to mark the whole thing as asynchronous and pass the WebSocket into my task. So it looks like this:

async def task(parameters):
    result = 0
    for _ in range(10):
        for _ in range(10):
            for _ in range(10):
                intermediate_result = do_things()
                await parameters.websocket.send_text(intermediate_result)
                result += intermediate_result
    return result

@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
    while True:
        # (in reality the received text is parsed into a parameters object first)
        parameters = await websocket.receive_text()
        parameters.websocket = websocket
        result = await task(parameters)
        await websocket.send_text(result)

Sending the intermediate results works like a charm. BUT now my algorithm blocks FastAPI, because it is not truly asynchronous by itself: once I post a message to '/ws', FastAPI is blocked and does not respond to any other requests until my task is finished.

So I need some advice on how to

  • a) either send WebSocket messages from within a synchronous CPU-bound task (I didn't find a synchronous send_text alternative), so I can use def, or
  • b) make my CPU-bound task truly asynchronous, so that it no longer blocks anything when I use async def.

I tried using the ProcessPoolExecutor as described here, but it's not possible to pickle a coroutine, and as far as I understand, I have to make my task a coroutine (using async) to use websocket.send_text() inside it.
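One way to get option (a) without pickling anything is to keep the task synchronous and hand intermediate results back to the event loop through a queue, so only plain data crosses the thread boundary. Below is a minimal sketch of that idea; cpu_task, report, and the squared numbers are hypothetical stand-ins for the real task and do_things(), and the print calls mark where an await websocket.send_text(...) would go:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def cpu_task(report):
    # plain synchronous function; `report` is an ordinary callable, nothing async here
    result = 0
    for i in range(10):
        intermediate = i * i          # stand-in for do_things()
        report(intermediate)
        result += intermediate
    return result

async def run_with_updates():
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()
    # call_soon_threadsafe hands items from the worker thread to the event loop safely
    report = lambda item: loop.call_soon_threadsafe(queue.put_nowait, item)
    with ThreadPoolExecutor() as pool:
        future = loop.run_in_executor(pool, cpu_task, report)
        while not future.done():
            try:
                item = await asyncio.wait_for(queue.get(), timeout=0.1)
                print("intermediate:", item)   # here: await websocket.send_text(...)
            except asyncio.TimeoutError:
                pass
        while not queue.empty():               # drain anything that arrived late
            print("intermediate:", queue.get_nowait())
        return await future

print(asyncio.run(run_with_updates()))  # prints 285 after the intermediate updates
```

Because the worker only calls a plain callback, the task itself stays a regular def function and nothing coroutine-shaped needs to be pickled or passed across threads.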

Also, I thought about just storing the intermediate results somewhere, making an HTTP POST to start the task, and then opening another WebSocket connection to read and send the intermediate results. But then I could just as well start a background task and implement a regular HTTP polling mechanism. I don't want either, mainly because I plan to deploy on Google Cloud Run, which throttles the CPU when all connections are closed, and because I think it's better practice to teach my task how to communicate via the WebSocket directly.

I hope my question is clear. This is my first larger-scale project with FastAPI and asynchronicity, and I haven't really used asyncio before, so I might have just missed something. Thanks for your suggestions.

In case someone comes across this, I'll add the solution that works for me now.

I was following this: https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.run_in_executor

The key is to make it non-blocking. So, for example, instead of:

# 1. Run in the default loop's executor:
result = await loop.run_in_executor(None, blocking_io)
print('default thread pool', result)

I move the await and change the code to:

# 1. Run in the default loop's executor:
thread = loop.run_in_executor(None, blocking_io)  # not awaited yet, so it doesn't block
while True:
    await asyncio.sleep(1)
    await websocket.send_text('status updates...')
    if internal_logger.blocking_stuff_finished:
        break
result = await thread  # already finished at this point, so this returns immediately
await websocket.send_text(f'result: {result}')
await websocket.close()

This way my CPU-bound work runs in a separate thread that I'm not awaiting immediately, and everything works fine.
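Stripped of the WebSocket and FastAPI parts, the pattern can be sketched as a small self-contained script; note that instead of a shared internal_logger flag, the future's own done() method can serve as the termination check (blocking_io here is a hypothetical stand-in for the real CPU-bound work):

```python
import asyncio

def blocking_io():
    # stand-in for the CPU-bound task
    return sum(range(1_000_000))

async def main():
    loop = asyncio.get_running_loop()
    task = loop.run_in_executor(None, blocking_io)  # starts right away in a worker thread
    while not task.done():                          # poll instead of a shared done-flag
        await asyncio.sleep(0.05)                   # the event loop stays responsive here
        print("status update...")
    return await task                               # finished already; resolves instantly

print(asyncio.run(main()))
```

Using task.done() avoids having to plumb a termination flag through the blocking function at all.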

Making a custom thread pool also works, but the context manager has to be removed to keep it non-blocking:

# 2. Run in a custom thread pool:
with concurrent.futures.ThreadPoolExecutor() as pool:
    result = await loop.run_in_executor(pool, blocking_io)

would then become:

pool = concurrent.futures.ThreadPoolExecutor()
thread = loop.run_in_executor(pool, blocking_io)

In theory, the same approach works with a ProcessPoolExecutor, but more work is needed: there is no shared memory between processes, so my termination condition wouldn't work as described above.
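For completeness, a hypothetical sketch of how the same idea could look with processes: a multiprocessing.Manager queue carries the intermediate results (only picklable data, never coroutines), and a None sentinel replaces the shared-memory termination flag, which no longer exists across process boundaries. The squared numbers again stand in for do_things(), and the prints mark where WebSocket sends would go:

```python
import asyncio
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor

def cpu_task(queue):
    result = 0
    for i in range(10):
        intermediate = i * i     # stand-in for do_things()
        queue.put(intermediate)  # only picklable data crosses the process boundary
        result += intermediate
    queue.put(None)              # sentinel: no more intermediate results
    return result

async def run_in_process():
    loop = asyncio.get_running_loop()
    with mp.Manager() as manager, ProcessPoolExecutor() as pool:
        queue = manager.Queue()  # proxy object, safe to pass to a worker process
        future = loop.run_in_executor(pool, cpu_task, queue)
        # queue.get blocks, so run it in the default thread executor
        while (item := await loop.run_in_executor(None, queue.get)) is not None:
            print("intermediate:", item)   # here: await websocket.send_text(...)
        return await future

if __name__ == "__main__":
    print(asyncio.run(run_in_process()))
```

The extra machinery (manager, sentinel, executor-wrapped queue.get) is exactly the "more work" mentioned above, in exchange for the CPU-bound part no longer competing with the event loop for the GIL.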

And yes, I know that CPU-bound work should preferably be done in a separate process, but moving it to a separate process does not work in my case, and I do enjoy the shared memory at the moment.
