
How can I avoid a webpage receiving a Gateway Timeout when calling a slow Python CGI script?

I have a LAMP server set up in EC2. A simple website hosted on this web server in /var/www/html/ allows a user to upload an audio file of people having a discussion via an input form:

<form action="../cgi-bin/store_mp3_view" method="post" accept-charset="utf-8" enctype="multipart/form-data">
    <label for="mp3">Audio file</label>
    <input type="file" name="filename" />
    <input type="submit" value="Upload" />
</form>

This audio file gets stored in /tmp/. As you can see, the form triggers a Python script I have in cgi-bin. Here is the script: http://pastebin.com/iNU6WSUV

The script uploads the audio file from my web server to an API by Honda, which detects utterances and produces an audio file for each utterance as well as a JSON object containing metadata for each utterance. It appears the utterance files, and the JSON for each utterance, can be fetched separately from Honda's API: https://api.hark.jp/docs/en/05_reference_webapi.html

My script waits for all of this processing to complete (all utterances processed and ready), then retrieves each audio file and sends it to the Bing Speech API to get the text from the speech. I do this because I want to play each utterance's audio file, with its associated text and metadata, in the browser in the order the conversation happened, in real time. A player, if you will.

The problem is that all of this takes too long, and the browser receives a gateway timeout from the CGI script. It can take several minutes. Specifically, Hark takes a while to return the complete results of the audio analysis, but it appears I can query their API and retrieve intermediate results, as mentioned earlier. However, the utterances don't finish in order, so utterance 3 may be ready before utterance 2, but I need to show 2 before 3 because a conversation has an order of utterances.

What is the best way to go about building an app that can do this? How can I background these API calls so they don't block and cause a timeout? Should I be using something like Flask for this web app? How can I render the results in the webpage as I iteratively poll and retrieve them from Hark? Is CGI the wrong tool for the job? Thanks.

Generally, the way to handle a long delay is to use yield and send partial data to the client. Instead of obj.wait(), you need a loop that checks whether the status is finished and, if not, prints something like: ... and sleeps for one second. This way you will not receive a timeout.
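The polling loop described above might look roughly like the sketch below. It is not taken from the original script: `is_finished` is a hypothetical callable standing in for whatever status check the job object offers, the `.` filler is a placeholder of my choosing, and the `max_wait` cap is an added safety assumption. Whether the partial output actually reaches the browser also depends on the server and any proxies not buffering the response, so treat this as a starting point rather than a guaranteed fix.

```python
import sys
import time


def wait_with_keepalive(is_finished, out=sys.stdout, interval=1.0, max_wait=600.0):
    """Poll is_finished() and emit a small chunk after each unsuccessful check.

    is_finished is a hypothetical zero-argument callable (an assumption, not
    part of the original script). Returns True if the job finished, or False
    if max_wait seconds elapsed first.
    """
    # A CGI response starts with headers followed by a blank line.
    out.write("Content-Type: text/html\r\n\r\n")
    out.write("<html><body><p>Processing")
    out.flush()

    deadline = time.monotonic() + max_wait
    while not is_finished():
        if time.monotonic() >= deadline:
            out.write("</p><p>Timed out.</p></body></html>")
            out.flush()
            return False
        # Placeholder filler so the connection keeps seeing output.
        out.write(".")
        out.flush()  # without a flush the chunk may sit in a local buffer
        time.sleep(interval)

    # The caller is expected to write the results and close the page.
    out.write("</p><p>Done.</p>")
    out.flush()
    return True
```

The caller would then write the actual results and the closing `</body></html>` after the function returns True.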

While Ali Nikneshan's answer was helpful, it seems CGI is not the right tool for the job. I decided to stop using a LAMP stack and CGI apps and set up a Tornado web server with WebSockets, which lets me easily make async calls, run background tasks, and use coroutines to set up a data pipeline that polls the API endpoint and feeds the data into the browser.

This presentation was quite helpful for understanding coroutines:

http://www.dabeaz.com/coroutines/Coroutines.pdf

And for Tornado:

http://www.tornadoweb.org/en/stable/index.html
