Form Recognizer custom model fails with invalid file type `{"error":{"code":"1000","message":"Invalid input file."}}`

I have successfully trained a custom model for key-value extraction; however, every file and file type I use to evaluate the model fails to return a result. So far I have tried both pdf and png files.

I matched the request shown in the API docs when building my query, but it still fails. Any suggestions?

import requests
import json
import os
import pathlib

# path of file to evaluate
floc = 'path/to/file'

# extract file type
file_type = pathlib.Path(floc).suffix[1:]

# set headers with file type and our api key
headers = {
    'Content-Type': f'application/{file_type}',
    'Ocp-Apim-Subscription-Key': os.environ["AZURE_FORM_RECOGNIZER_KEY"]
}

# read in the file as binary to send
files = {'file': open(floc, 'rb')}

# post the file to be analysed
r = requests.post(
    f'https://eastus.api.cognitive.microsoft.com/formrecognizer/v2.1/custom/models/{os.environ["MODEL_ID"]}/analyze',
    headers=headers,
    files=files
)

r

The result is a 400 response with the following error:

{"error":{"code":"1000","message":"Invalid input file."}}

A very similar query using the layout/analyze request works perfectly. I have also read this question, which reports the same error from cURL, but it has not helped.

I have fixed my problem but will leave my answer for anyone else who runs into it.

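There were two main problems:

1. The endpoint was wrong: the request went to https://eastus.api.cognitive.microsoft.com rather than to the endpoint of my own Form Recognizer resource.
2. The file was passed with files=, which makes requests send a multipart form upload. The file has to be sent as raw binary in the request body, using data= instead.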
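To see why passing the file with files= leads to "Invalid input file", the two variants of the request can be prepared without sending them. This is a minimal sketch, assuming the requests library is installed; the URL and payload bytes are placeholders, and nothing goes over the network.

```python
import io

import requests

# Placeholder values for illustration only; no request is actually sent.
URL = "https://example.invalid/analyze"
HEADERS = {"Content-Type": "application/pdf"}
PAYLOAD = b"%PDF-1.4 placeholder bytes"

# files= wraps the payload in a multipart/form-data envelope, while the
# explicitly set Content-Type header still claims the body is a plain PDF.
multipart = requests.Request(
    "POST", URL, headers=HEADERS, files={"file": io.BytesIO(PAYLOAD)}
).prepare()

# data= sends the payload unchanged as the request body.
raw = requests.Request("POST", URL, headers=HEADERS, data=PAYLOAD).prepare()

print(multipart.body[:60])  # begins with a multipart boundary, not the file bytes
print(raw.body == PAYLOAD)  # the raw variant carries exactly the file bytes
```

With files=, the service receives boundary lines and form headers in front of the document, so the body no longer parses as the file type named in Content-Type.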

The fix is found below:

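import requests
import os
import pathlib

# path of file to evaluate
floc = 'path/to/file'

# endpoint of your own Form Recognizer resource, without a trailing slash
# (AZURE_FORM_RECOGNIZER_ENDPOINT is an assumed variable name, not part of the original post)
endpoint = os.environ["AZURE_FORM_RECOGNIZER_ENDPOINT"]

# extract file type
file_type = pathlib.Path(floc).suffix[1:]

# set headers with file type and our api key
# note: this gives application/pdf for PDFs; the documented type for images is image/<type>, e.g. image/png
headers = {
    'Content-Type': f'application/{file_type}',
    'Ocp-Apim-Subscription-Key': os.environ["AZURE_FORM_RECOGNIZER_KEY"]
}

# post the file to be analysed
with open(floc, 'rb') as f:
    r = requests.post(
        f'{endpoint}/formrecognizer/v2.1/custom/models/{os.environ["MODEL_ID"]}/analyze',
        headers=headers,
        data=f  # send the raw binary of your file as the request body
    )

r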

You can find your own endpoint value on the Azure portal page for your Form Recognizer resource:

(Image: Form Recognizer instance information)
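As a small addition, the URL and Content-Type header can be built by helpers so that a trailing slash on the endpoint or an image file does not produce a malformed request. This is only a sketch: content_type_for and analyze_url are hypothetical names, and the mapping covers only some of the common content types documented for v2.1 (application/pdf, image/png, image/jpeg, image/tiff).

```python
import pathlib

# Some of the content types documented for Form Recognizer v2.1 analyze
# requests, keyed by file extension.
CONTENT_TYPES = {
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
}


def content_type_for(path):
    """Return the Content-Type header value for a file, based on its extension."""
    suffix = pathlib.Path(path).suffix.lower()
    if suffix not in CONTENT_TYPES:
        raise ValueError(f"Unsupported file type: {suffix!r}")
    return CONTENT_TYPES[suffix]


def analyze_url(endpoint, model_id):
    """Build the v2.1 custom model analyze URL, tolerating a trailing slash."""
    return f"{endpoint.rstrip('/')}/formrecognizer/v2.1/custom/models/{model_id}/analyze"
```

These could then replace the f-strings in the request above, for example headers={'Content-Type': content_type_for(floc), ...} and analyze_url(endpoint, os.environ["MODEL_ID"]).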
