
Python run_in_executor and forget?

How can I run a blocking function in an executor in such a way that its result doesn't matter, so that the main thread doesn't wait for it or get slowed down by it?

To be honest, I'm not sure this is even the right solution. All I want is some kind of processing queue separated from the main process, so that it doesn't block the server application from responding to requests, since this type of web server runs one worker for many requests.

Preferably I would like to stay away from solutions like Celery, but if that's the best option I would be willing to learn it.

The context here is an async web server that generates PDF files with large images.

import asyncio
from concurrent.futures import ProcessPoolExecutor

from sanic import Sanic, response

app = Sanic("pdf_server")  # placeholder name; recent Sanic versions require one
# App-wide ("global") worker pool
executor = ProcessPoolExecutor(max_workers=5)

@app.route('/')
async def getPdf(request):
  asyncio.create_task(renderPdfsInExecutor(request.json))
  # This should return "instantly", regardless of PDF generation time
  return response.text('Pdf being generated, it will be sent to your email when done')

async def renderPdfsInExecutor(json):
  asyncio.get_running_loop().run_in_executor(executor, syncRenderPdfs, json)

def syncRenderPdfs(json):
  # Some PDF library that downloads images synchronously
  pdfs = somePdfLibrary.generatePdfsFromJson(json)
  sendToDefaultMail(pdfs)

The above code gives the following error (yes, it is running as admin):

PermissionError [WinError 5] Access denied
Future exception was never retrieved

Bonus question: Do I gain anything by running an asyncio loop inside the executor, so that if it is handling several PDF requests at once it will distribute the processing between them? If yes, how do I do it?

OK, so first of all there is a misunderstanding. This

async def getPdf(request):
    asyncio.create_task(renderPdfsInExecutor(request.json))
    ...

async def renderPdfsInExecutor(json):
    asyncio.get_running_loop().run_in_executor(executor, syncRenderPdfs, json)

is redundant. It is enough to do

async def getPdf(request):
    asyncio.get_running_loop().run_in_executor(executor, syncRenderPdfs, request.json)
    ...

or (since you don't want to await) even better

async def getPdf(request):
    executor.submit(syncRenderPdfs, request.json)
    ...

Now, the problem you get is because syncRenderPdfs throws PermissionError. It is not handled, so Python warns you: "Hey, some background code threw an error, but nobody owns that code, so what am I supposed to do with it?" That's why you get "Future exception was never retrieved". You have a problem with the PDF library itself, not with asyncio. Once you fix that inner problem, it is also a good idea to be safe:

def syncRenderPdfs(json):
    try:
        #Some PDF Library that downloads images synchronously
        pdfs = somePdfLibrary.generatePdfsFromJson(json)
        sendToDefaultMail(pdfs)
    except Exception:
        logger.exception('Something went wrong')  # or whatever
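
If you would rather not wrap every job body in try/except, another option is to attach a done callback to the future returned by executor.submit. This is only a minimal sketch: log_failure, submit_and_forget, failing_job and the logger name are made-up names for illustration, and a ThreadPoolExecutor is used just to keep the demo small.

```python
import logging
from concurrent.futures import Future, ThreadPoolExecutor

logger = logging.getLogger("pdf_jobs")  # hypothetical logger name


def log_failure(future: Future) -> None:
    """Retrieve the job's exception, if any, so it is not silently dropped."""
    if future.cancelled():
        return
    exc = future.exception()
    if exc is not None:
        logger.error("Background job failed: %r", exc)


def submit_and_forget(executor, func, *args) -> Future:
    """Submit func to the executor and attach the logging callback."""
    future = executor.submit(func, *args)
    future.add_done_callback(log_failure)
    return future


def failing_job(payload):
    # Stand-in for syncRenderPdfs; it always fails so the callback fires.
    raise PermissionError("simulated failure")


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    with ThreadPoolExecutor(max_workers=2) as pool:
        submit_and_forget(pool, failing_job, {"pages": 1})
```

The caller still never waits on the result; the callback only makes sure a failure ends up in the log.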

Your "permission denied" issue is a whole different thing, and you should debug it and/or post a separate question for it.

As for the final question: yes, the executor will queue the submitted tasks and distribute them among its workers as they become free.
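
To see the queueing for yourself, here is a small sketch that submits more jobs than there are workers. The job and run_jobs names are invented for the demo, and threads are used instead of processes only so that the worker name is easy to record.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def job(n):
    # Stand-in for blocking work; reports which worker thread ran it.
    time.sleep(0.05)
    return n, threading.current_thread().name


def run_jobs(num_jobs=6, max_workers=2):
    """Submit more jobs than workers; the extra jobs wait in the executor's queue."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(job, n) for n in range(num_jobs)]
        return [f.result() for f in futures]


if __name__ == "__main__":
    for n, worker in run_jobs():
        print(f"job {n} ran on {worker}")
```

All six jobs complete even though only two workers exist, because each worker picks up the next queued job when it finishes the previous one.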

EDIT: As we discussed in the comments, the actual problem might be with the Windows environment you work on, or more precisely with ProcessPoolExecutor, i.e. spawning processes may change permissions. I advise using ThreadPoolExecutor instead, assuming it works fine on the platform.
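
A rough sketch of the ThreadPoolExecutor variant, without Sanic so that it runs on its own. Here slow_render stands in for syncRenderPdfs and handle_request for the route handler; both names, and the finished list, are made up for the demo.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def slow_render(job_id, finished):
    # Stand-in for the blocking PDF rendering call.
    time.sleep(0.2)
    finished.append(job_id)


async def handle_request(executor, job_id, finished):
    # Submit and return right away; the handler never awaits the job.
    executor.submit(slow_render, job_id, finished)
    return f"job {job_id} accepted"


def run_demo():
    finished = []
    with ThreadPoolExecutor(max_workers=5) as executor:
        start = time.perf_counter()
        reply = asyncio.run(handle_request(executor, 1, finished))
        elapsed = time.perf_counter() - start
    # Leaving the with block waits for the background job to finish.
    return reply, elapsed, finished


if __name__ == "__main__":
    reply, elapsed, finished = run_demo()
    print(reply, f"(handler returned after {elapsed:.3f}s)")
    print("finished jobs:", finished)
```

Keep in mind that threads share one interpreter, so CPU-heavy rendering in pure Python may still compete with the event loop for the GIL; for work dominated by image downloads that usually matters less.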

You can look at asyncio.gather(*tasks) to run several in parallel.
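
A minimal sketch of that idea, combined with run_in_executor so that the blocking calls run on worker threads. The render and render_all names are placeholders. Note that, unlike the fire-and-forget approach above, this one waits for all the results.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def render(job_id):
    # Stand-in for a blocking rendering call.
    return f"pdf-{job_id}"


async def render_all(job_ids):
    """Run the blocking jobs on worker threads and wait for all of them."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=3) as pool:
        tasks = [loop.run_in_executor(pool, render, j) for j in job_ids]
        # gather returns the results in the same order as the tasks.
        return await asyncio.gather(*tasks)


if __name__ == "__main__":
    print(asyncio.run(render_all([1, 2, 3])))
```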
