
Tika on Azure Python Flask throwing error 500

I developed this simple parsing tool using Tika and python-flask, and deployed it to Azure (Python Flask Web App). The app works fine on my local machine and loads fine in Azure, but it throws error 500 specifically when the program calls Tika. I did install Tika 1.18 via requirements.txt.

The web server folder where the file to be parsed is temporarily stored seems to be accessible, since other parts of the code can read and delete the same folder and file. I read in a few online articles that the issue could be that the Java virtual machine for tika-server.jar may not start automatically in Azure...
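One way to test that hypothesis is to check whether the web app's environment can run Java at all, since tika-python normally launches tika-server.jar through a `java` executable on first use. The following is only a minimal sketch: the helper name `describe_java` is made up for illustration, and it assumes the Java runtime would be found on `PATH` under the name `java`.

```python
import shutil
import subprocess


def describe_java(executable: str = "java") -> str:
    """Report whether a Java runtime is callable from this process.

    `describe_java` is a hypothetical helper, not part of tika-python.
    """
    path = shutil.which(executable)
    if path is None:
        return f"{executable} not found on PATH"
    # `java -version` usually writes its banner to stderr rather than stdout.
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, timeout=30
    )
    return (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    print(describe_java())
```

Running this from the Kudu console, or logging its return value from inside the Flask app, should show whether the local Tika server could be started in that environment.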

Below is an extract of the views.py code and the errors from the KUDU and WSGI logs.

Thanks,
Jiro

views.py extract:

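import os

import tika
from flask import render_template, request
from tika import unpack
from werkzeug.utils import secure_filename

# app, login_required and UploadForm are defined elsewhere in the application.

@app.route('/upload', methods=['GET', 'POST'])
@login_required
def upload():
    error = None
    form = UploadForm()
    if request.method == "POST":
        file = form.upload_file.data
        if file:
            # Save the upload to the 'uploads' folder next to this module.
            dirname = os.path.dirname(__file__)
            target = os.path.join(dirname, 'uploads')
            filename = secure_filename(file.filename)
            destination = os.path.join(target, filename)
            file.save(destination)
            # This is the Tika call that triggers error 500 on Azure.
            parsed_file = unpack.from_file(destination)
            # "content" can be None when no text was extracted.
            parsed_content = parsed_file["content"] or ""
            # Collapse whitespace and drop non-ASCII characters.
            parsed_content = ' '.join(parsed_content.split())
            parsed_content = parsed_content.encode('ascii', 'ignore').decode('ascii')
            form.input_area.data = parsed_content
            os.remove(destination)
        else:
            error = "Select a File to Parse"
    # Render the page for GET requests and after every POST.
    return render_template("upload.html", form=form, error=error)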

KUDU error:

HTTP Error 500.0 - Internal Server Error. The page cannot be displayed because an internal server error has occurred.

Most likely causes:
  - IIS received the request; however, an internal error occurred during the processing of the request. The root cause of this error depends on which module handles the request and what was happening in the worker process when this error occurred.
  - IIS was not able to access the web.config file for the Web site or application. This can occur if the NTFS permissions are set incorrectly.
  - IIS was not able to process configuration for the Web site or application.
  - The authenticated user does not have permission to use this DLL.
  - The request is mapped to a managed handler but the .NET Extensibility Feature is not installed.

Things you can try:
  - Ensure that the NTFS permissions for the web.config file are correct and allow access to the Web server's machine account.
  - Check the event logs to see if any additional information was logged.
  - Verify the permissions for the DLL.
  - Install the .NET Extensibility feature if the request is mapped to a managed handler.
  - Create a tracing rule to track failed requests for this HTTP status code. For more information about creating a tracing rule for failed requests, click here.

Detailed Error Information:
  Module: FastCgiModule
  Notification: ExecuteRequestHandler
  Handler: PythonHandler
  Error Code: 0x00000000

WSGI Logs: WSGI log

According to the logs, the issue is in connecting to the Tika server. Consider breaking this into two issues:

  1. Installing Tika on Azure and verifying that it is running (these instructions might help).

  2. Configuring the Python application to connect to the Tika server, either via environment variables or by specifying serverEndpoint explicitly in the unpack.from_file() call.
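The two steps above can be sketched roughly as follows. This is an illustration under assumptions, not a tested Azure setup: the address `http://localhost:9998` is only a placeholder for wherever your Tika server actually runs, the helper names (`tika_endpoint`, `tika_is_reachable`, `parse_with_tika`) are invented for this example, and the reachability check assumes the server answers a plain GET on `/tika` with HTTP 200. `TIKA_SERVER_ENDPOINT` is the environment variable described in the tika-python README; as far as I know the library reads it at import time, so it should be set before `tika` is imported.

```python
import os
import urllib.request

# Placeholder address of an already-running Tika server (an assumption).
DEFAULT_ENDPOINT = "http://localhost:9998"


def tika_endpoint() -> str:
    """Read the server address from TIKA_SERVER_ENDPOINT, with a fallback."""
    return os.environ.get("TIKA_SERVER_ENDPOINT", DEFAULT_ENDPOINT).rstrip("/")


def tika_is_reachable(endpoint: str, timeout: float = 5.0) -> bool:
    """Step 1: return True if GET <endpoint>/tika answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{endpoint}/tika", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError):
        # Covers refused connections, timeouts, HTTP errors and malformed URLs.
        return False


def parse_with_tika(path: str) -> dict:
    """Step 2: parse a file against an explicit endpoint."""
    # Imported lazily so the helpers above work even without tika installed.
    from tika import unpack

    endpoint = tika_endpoint()
    if not tika_is_reachable(endpoint):
        raise RuntimeError(f"Tika server not reachable at {endpoint}")
    return unpack.from_file(path, serverEndpoint=endpoint)
```

In the upload view, `unpack.from_file(destination)` would then become `parse_with_tika(destination)`. Raising a descriptive error when the server is unreachable should also make the WSGI log easier to read than a bare error 500.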
