
Form Recognizer custom model fails with invalid file type `{"error":{"code":"1000","message":"Invalid input file."}}`

I have successfully trained a custom model for key-value extraction, but every file I use to evaluate the model fails to return a result, regardless of file type. So far I have tried both PDF and PNG files.

I have modelled my request on the example in the API docs, but it still fails. Any suggestions?

import requests
import json
import os
import pathlib

# path of file to evaluate
floc = 'path/to/file'

# extract file type
file_type = pathlib.Path(floc).suffix[1:]

# set headers with file type and our api key
headers = {
    'Content-Type': f'application/{file_type}',
    'Ocp-Apim-Subscription-Key': os.environ["AZURE_FORM_RECOGNIZER_KEY"]
}

# read in the file as binary to send
files = {'file': open(floc, 'rb')}

# post the file to be analysed
r = requests.post(
    f'https://eastus.api.cognitive.microsoft.com/formrecognizer/v2.1/custom/models/{os.environ["MODEL_ID"]}/analyze',
    headers=headers,
    files=files
)

r

The result is a 400 response with the following error:

{"error":{"code":"1000","message":"Invalid input file."}}

A very similar request to the layout/analyze endpoint works perfectly. I have also read this question, which reports the same error from cURL, but it has not helped.

I have fixed my problem but will leave my answer here for anyone else.

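There were two main problems:

1. The endpoint was wrong. The request has to go to the endpoint of my own Form Recognizer resource rather than the generic https://eastus.api.cognitive.microsoft.com address.
2. The file was passed with files=, which makes requests send a multipart form upload. It has to be passed with data=, so that the raw binary of the file becomes the request body.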
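The difference between files= and data= in requests can be inspected offline by preparing a request without sending it. The sketch below is only an illustration: the URL is a placeholder that is never contacted and PAYLOAD is a made-up stand-in for the bytes of a real document.

```python
import requests

URL = "https://example.invalid/analyze"     # placeholder, never contacted
HEADERS = {"Content-Type": "application/pdf"}
PAYLOAD = b"%PDF-1.4 fake document bytes"   # hypothetical stand-in for a real file

# files= wraps the payload in a multipart/form-data body
multipart = requests.Request(
    "POST", URL, headers=HEADERS, files={"file": PAYLOAD}
).prepare()

# data= uses the bytes unchanged as the request body
raw = requests.Request("POST", URL, headers=HEADERS, data=PAYLOAD).prepare()

print(multipart.headers["Content-Type"])  # the explicitly set header is kept
print(multipart.body[:60])                # boundary and form-data headers come first
print(raw.body == PAYLOAD)                # the raw request carries exactly the file bytes
```

With files=, the body starts with a multipart boundary while the explicitly set Content-Type header still claims a plain PDF, which is a plausible reason for the service to reject the input as an invalid file.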

The fixed code is below:

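import mimetypes
import os

import requests

# path of file to evaluate
floc = 'path/to/file'

# endpoint of your own Form Recognizer resource
# (AZURE_FORM_RECOGNIZER_ENDPOINT is an assumed variable name; set it to your own endpoint value)
endpoint = os.environ["AZURE_FORM_RECOGNIZER_ENDPOINT"].rstrip('/')

# work out the MIME type from the file name, e.g. application/pdf or image/png
content_type, _ = mimetypes.guess_type(floc)

# set headers with the content type and our api key
headers = {
    'Content-Type': content_type,
    'Ocp-Apim-Subscription-Key': os.environ["AZURE_FORM_RECOGNIZER_KEY"]
}

# post the file to be analysed
with open(floc, 'rb') as f:
    r = requests.post(
        f'{endpoint}/formrecognizer/v2.1/custom/models/{os.environ["MODEL_ID"]}/analyze',
        headers=headers,
        data=f  # send the raw binary of your file as the request body
    )

r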
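In the v2.1 API the analyze call is asynchronous: a successful POST returns 202 with an Operation-Location header, and the result has to be fetched from that URL once the status is no longer notStarted or running. A minimal polling sketch follows. The function name wait_for_result and its parameters are hypothetical, and the HTTP call is passed in as get_json so the loop can be tried without network access.

```python
import time


def wait_for_result(get_json, operation_url, interval=1.0, timeout=60.0,
                    sleep=time.sleep):
    """Poll operation_url until the analysis succeeds, fails, or times out.

    get_json is any callable that takes a URL and returns the decoded JSON
    body as a dict (hypothetical helper, supplied by the caller).
    """
    waited = 0.0
    while True:
        result = get_json(operation_url)
        if result.get("status") in ("succeeded", "failed"):
            return result
        if waited >= timeout:
            raise TimeoutError(f"analysis still pending after {timeout} seconds")
        sleep(interval)
        waited += interval


# Possible usage with the response `r` from the POST above (not executed here):
# key_header = {"Ocp-Apim-Subscription-Key": os.environ["AZURE_FORM_RECOGNIZER_KEY"]}
# result = wait_for_result(
#     lambda url: requests.get(url, headers=key_header).json(),
#     r.headers["Operation-Location"],
# )
```

Injecting the fetch function keeps the retry logic separate from the HTTP client, so the same loop works with requests or any other client.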

You can find your own endpoint value by opening your Form Recognizer resource in the Azure portal:

[Screenshot: Form Recognizer instance information]
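The endpoint value is often displayed with a trailing slash, so joining it to the path with a plain f-string can produce a double slash. A small sketch of a helper that normalises this is shown below. The function name build_analyze_url, the resource name my-resource, and the all-zero model ID are made up for illustration.

```python
def build_analyze_url(endpoint, model_id, api_version="v2.1"):
    """Join a resource endpoint and model ID into a custom-model analyze URL."""
    base = endpoint.rstrip("/")  # tolerate a trailing slash on the endpoint
    return f"{base}/formrecognizer/{api_version}/custom/models/{model_id}/analyze"


# Hypothetical resource name and model ID, for illustration only
url = build_analyze_url(
    "https://my-resource.cognitiveservices.azure.com/",
    "00000000-0000-0000-0000-000000000000",
)
print(url)
```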


