
Tika on Azure Python Flask throwing error 500

I developed a simple parsing tool using Tika and Python Flask and deployed it to Azure as a Python Flask Web App. The app works fine on my local machine and loads fine in Azure, but it throws error 500 specifically when the program calls Tika. I installed Tika 1.18 via requirements.txt.

The web server folder where the file to be parsed is temporarily stored seems to be accessible, since other parts of the code can read and delete that folder and file. I read in a few online articles that the issue could be that the Java virtual machine running tika-server.jar is not started automatically in Azure...

Below is an extract of the views.py code and the errors from the KUDU and WSGI logs.

Thanks, Jiro

views.py extract:

import os

import tika
from flask import render_template, request
from tika import unpack
from werkzeug.utils import secure_filename

# app, login_required and UploadForm are defined elsewhere in the project.

@app.route('/upload', methods=['GET', 'POST'])
@login_required
def upload():
    error = None
    if request.method == "POST":
        form = UploadForm()
        file = form.upload_file.data
        if file:
            # Save the upload next to this module, under ./uploads
            dirname = os.path.dirname(__file__)
            target = os.path.join(dirname, 'uploads')
            filename = secure_filename(file.filename)
            destination = os.path.join(target, filename)
            file.save(destination)
            # This is the call that fails with error 500 on Azure
            parsed_file = unpack.from_file(destination)
            parsed_content = parsed_file["content"]
            # Collapse whitespace and drop non-ASCII characters
            parsed_content = ' '.join(parsed_content.split())
            parsed_content = parsed_content.encode('ascii', 'ignore').decode('ascii')
            form.input_area.data = parsed_content
            os.remove(destination)
            return render_template("upload.html", form=form, error=error)
    else:
        error = "Select a File to Parse"

KUDU error:

HTTP Error 500.0 - Internal Server Error
The page cannot be displayed because an internal server error has occurred.

Most likely causes:

  - IIS received the request; however, an internal error occurred during the processing of the request. The root cause of this error depends on which module handles the request and what was happening in the worker process when this error occurred.
  - IIS was not able to access the web.config file for the Web site or application. This can occur if the NTFS permissions are set incorrectly.
  - IIS was not able to process configuration for the Web site or application.
  - The authenticated user does not have permission to use this DLL.
  - The request is mapped to a managed handler but the .NET Extensibility Feature is not installed.

Things you can try:

  - Ensure that the NTFS permissions for the web.config file are correct and allow access to the Web server's machine account.
  - Check the event logs to see if any additional information was logged.
  - Verify the permissions for the DLL.
  - Install the .NET Extensibility feature if the request is mapped to a managed handler.
  - Create a tracing rule to track failed requests for this HTTP status code.

Detailed Error Information:

  Module: FastCgiModule
  Notification: ExecuteRequestHandler
  Handler: PythonHandler
  Error Code: 0x00000000

WSGI logs: [image: WSGI log]

According to the logs, the application fails when it tries to connect to the Tika server. Consider splitting this into two separate problems:

  1. Installing Tika on Azure and verifying that the server is running (these instructions might help).

  2. Configuring the Python application to connect to that Tika server, either via environment variables or by passing serverEndpoint explicitly in the unpack.from_file() call.
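
The two steps above could be sketched roughly as follows. This is a minimal sketch, not a tested Azure setup: it assumes a Tika server is already running somewhere the web app can reach, and that its URL is supplied through the TIKA_SERVER_ENDPOINT environment variable (which tika-python also reads itself). The helper names resolve_tika_endpoint, tika_server_is_up and parse_with_tika are made up for illustration. The health check simply issues a GET on the server's /version resource.

```python
import os
import urllib.request

# tika-python falls back to this endpoint when nothing else is configured.
DEFAULT_ENDPOINT = "http://localhost:9998"


def resolve_tika_endpoint(environ=None):
    """Return the Tika server URL from TIKA_SERVER_ENDPOINT, without a trailing slash."""
    environ = os.environ if environ is None else environ
    return environ.get("TIKA_SERVER_ENDPOINT", DEFAULT_ENDPOINT).rstrip("/")


def tika_server_is_up(endpoint, timeout=5):
    """Return True if a GET on <endpoint>/version answers with HTTP 200."""
    try:
        with urllib.request.urlopen(endpoint + "/version", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


def parse_with_tika(path, endpoint=None):
    """Parse a file through an already-running Tika server (hypothetical helper)."""
    # Imported lazily so the helpers above work even where tika is not installed.
    from tika import unpack

    endpoint = endpoint or resolve_tika_endpoint()
    if not tika_server_is_up(endpoint):
        raise RuntimeError("Tika server not reachable at " + endpoint)
    # Passing serverEndpoint explicitly avoids relying on the library default.
    parsed = unpack.from_file(path, serverEndpoint=endpoint)
    return (parsed or {}).get("content") or ""
```

In the view, unpack.from_file(destination) would then become parse_with_tika(destination). Raising a descriptive error when the server is unreachable should also make the WSGI log more useful than the generic IIS 500 page. If I read the tika-python README correctly, setting TIKA_CLIENT_ONLY=True in the app settings additionally stops the library from trying to download and start tika-server.jar itself, but treat that as something to verify rather than a given.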

