Azure Computer Vision: Recognize Printed Text
I'm using Azure Computer Vision with Node.js and would like to extract text from images. It works as expected, but I'm facing some challenges. The code:
'use strict';
const request = require('request');
const subscriptionKey = 'key';
const endpoint = 'endpoint'
var uriBase = endpoint + 'vision/v3.1/ocr';
const imageUrl = 'https://livesimply.me/wp-content/uploads/2015/09/foods-to-avoid-real-food-3036-2-1024x683.jpg';
// Request parameters.
const params = {
'language': 'unk',
'detectOrientation': 'true',
};
const options = {
uri: uriBase,
qs: params,
body: '{"url": ' + '"' + imageUrl + '"}',
headers: {
'Content-Type': 'application/json',
'Ocp-Apim-Subscription-Key' : subscriptionKey
}
};
request.post(options, (error, response, body) => {
if (error) {
console.log('Error: ', error);
return;
}
let jsonResponse = JSON.stringify(JSON.parse(body), null, ' ');
console.log('JSON Response\n');
console.log(jsonResponse);
});
The output:
"regions": [
{
"boundingBox": "0,191,277,281",
"lines": [
{
"boundingBox": "53,191,23,49",
"words": [
{
"boundingBox": "53,191,23,49",
"text": "in"
}
]
},
{
"boundingBox": "0,285,277,82",
"words": [
{
"boundingBox": "0,285,150,82",
"text": ")arb.0g"
},
{
"boundingBox": "214,288,63,63",
"text": "0%"
}
]
},
{
"boundingBox": "14,393,45,79",
"words": [
{
"boundingBox": "14,393,45,79",
"text": "Og"
}
]
},
{
"boundingBox": "213,394,63,63",
"words": [
{
"boundingBox": "213,394,63,63",
"text": "00/0"
}
]
}
]
},
{
"boundingBox": "322,184,352,457",
"lines": [
{
"boundingBox": "326,184,348,54",
"words": [
{
"boundingBox": "326,184,239,52",
"text": "INGREDIENTS:"
},
{
"boundingBox": "588,188,86,50",
"text": "WHITE"
}
]
},
{
"boundingBox": "325,248,281,59",
"words": [
{
"boundingBox": "325,248,83,56",
"text": "TUNA,"
},
{
"boundingBox": "417,250,127,51",
"text": "SOYBEAN"
},
{
"boundingBox": "555,252,51,55",
"text": "OIL,"
}
]
},
{
"boundingBox": "324,313,341,60",
"words": [
{
"boundingBox": "324,313,155,52",
"text": "VEGETABLE"
},
{
"boundingBox": "489,316,101,56",
"text": "BROTH,"
},
{
"boundingBox": "598,317,67,56",
"text": "SALT,"
}
]
},
{
"boundingBox": "324,378,334,53",
"words": [
{
"boundingBox": "324,378,235,52",
"text": "PYROPHOSPHATE"
},
{
"boundingBox": "566,381,92,50",
"text": "ADDED"
}
]
},
{
"boundingBox": "323,519,248,52",
"words": [
{
"boundingBox": "323,519,193,51",
"text": "DISTRIBUTED"
},
{
"boundingBox": "528,521,43,50",
"text": "BY:"
}
]
},
{
"boundingBox": "322,584,298,57",
"words": [
{
"boundingBox": "322,584,124,50",
"text": "BUMBLE"
},
{
"boundingBox": "457,585,52,50",
"text": "BEE"
},
{
"boundingBox": "519,585,101,56",
"text": "FOODS,"
}
]
}
]
},
{
"boundingBox": "791,400,198,117",
"lines": [
{
"boundingBox": "921,400,68,45",
"words": [
{
"boundingBox": "921,400,68,45",
"text": ",11."
}
]
},
{
"boundingBox": "791,464,128,53",
"words": [
{
"boundingBox": "791,464,75,53",
"text": "PRC:"
},
{
"boundingBox": "874,467,45,48",
"text": "x"
}
]
}
]
}
]
}
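But I'm facing some challenges with this code:
1. I want the output as a string and not a JSON tree.
2. I would like to extract just the ingredients and not all the text.
3. In some cases the images may have ingredients without specifying the ingredient keyword. How can I extract the ingredients in this case?
Thanks for your help, experts.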
Microsoft provides a new online portal for testing this service, among others, together with the input requirements for the Read API. The output is provided as both a string and JSON.
Link: https://preview.vision.azure.com/demo/OCR
If you need to extract key-value pairs, you can train a custom model using Form Recognizer.
• Custom – Extract key-value pairs trained on your own documents
Here is a link to custom models using Form Recognizer.
The Computer Vision REST API extracts printed text from an image with optical character recognition (OCR), and a successful response is returned as JSON. You can't get a direct string output from this Azure Cognitive Service.
For the first problem:
I want the output as a string and not JSON tree.
The ingredients can't be printed directly as a string the way they appear in the image. To extract the content and display it in a particular format, parse the JSON string into an object once you receive it and loop over it to extract the data. Then use the split function to store the data in arrays, as shown in the snippet below.
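request.post(options, function (error, response, body) {
  if (error) {
    console.log(error);
  } else {
    // Parse the JSON string returned by the OCR endpoint.
    var ob = JSON.parse(body);
    // Start from an empty string so the result is not prefixed with "undefined".
    var str = "";
    // Walk regions -> lines -> words and collect the text of every word.
    for (var i = 0; i < ob.regions.length; i++) {
      for (var j = 0; j < ob.regions[i].lines.length; j++) {
        for (var k = 0; k < ob.regions[i].lines[j].words.length; k++) {
          str = str + " " + ob.regions[i].lines[j].words[k].text;
        }
        str = str + "\n";
      }
    }
    // One array entry per recognized line.
    var arr = str.split("\n");
    console.log(arr);
  }
});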
Adapt this logic to the JSON structure you actually receive.
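The same traversal can also be packaged as a small reusable function. The following is a minimal sketch, assuming the response has the regions, lines and words nesting shown in the question's output. The name ocrToText is hypothetical and is not part of any Azure SDK.

```javascript
'use strict';

// Hypothetical helper (not part of any Azure SDK): flattens an OCR response
// shaped like { regions: [ { lines: [ { words: [ { text } ] } ] } ] }
// into plain text, one recognized line per output line.
function ocrToText(result) {
  const regions = (result && result.regions) || [];
  return regions
    .map((region) =>
      (region.lines || [])
        .map((line) => (line.words || []).map((word) => word.text).join(' '))
        .join('\n')
    )
    .join('\n');
}

// Small sample taken from the output in the question.
const sample = {
  regions: [
    {
      lines: [
        { words: [{ text: 'INGREDIENTS:' }, { text: 'WHITE' }] },
        { words: [{ text: 'TUNA,' }, { text: 'SOYBEAN' }, { text: 'OIL,' }] },
      ],
    },
  ],
};

console.log(ocrToText(sample));
```

Inside the request callback this would be called as ocrToText(JSON.parse(body)). Missing regions, lines or words are treated as empty, so a response without recognized text yields an empty string.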
For your second and third problems:
I would like to extract just the ingredients and not the all text.
In some cases the images may have ingredients without specifying the ingredient key-word, how can I extract the ingredients in this case?
Computer Vision returns all the printed text it finds in the image as JSON; it can't be asked to return only particular text. You can get the required result with the approach described above, keeping only the ingredients when you process the parsed output.
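One way to keep only the ingredients is to post-process the recognized lines yourself. The sketch below is a heuristic, not a feature of the Computer Vision API. It assumes the list starts at a line containing the keyword INGREDIENTS and ends at the first later line matching a stop phrase such as DISTRIBUTED BY. The function name extractIngredients and both patterns are assumptions chosen to fit the sample output in the question. Images without the keyword would need a different rule (for example, matching against a list of known ingredient names), which is not shown here.

```javascript
'use strict';

// Hypothetical helper: takes the recognized lines (array of strings) and
// returns the comma-separated items found between a start keyword and a
// stop phrase. Both patterns are assumptions for this sample.
function extractIngredients(
  lines,
  startPattern = /INGREDIENTS\s*:?/i,
  stopPattern = /DISTRIBUTED\s+BY/i
) {
  const start = lines.findIndex((line) => startPattern.test(line));
  if (start === -1) {
    return []; // keyword not found, so this heuristic cannot help
  }
  const collected = [];
  for (let i = start; i < lines.length; i++) {
    if (i > start && stopPattern.test(lines[i])) {
      break; // reached the text that follows the ingredient list
    }
    collected.push(lines[i]);
  }
  return collected
    .join(' ')
    .replace(startPattern, '') // drop the keyword itself
    .split(',')
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}

// Lines of the second region from the output in the question.
const lines = [
  'INGREDIENTS: WHITE',
  'TUNA, SOYBEAN OIL,',
  'VEGETABLE BROTH, SALT,',
  'PYROPHOSPHATE ADDED',
  'DISTRIBUTED BY:',
  'BUMBLE BEE FOODS,',
];

console.log(extractIngredients(lines));
```

Because OCR splits the list across several lines, the lines are joined before splitting on commas, so an item such as WHITE TUNA that wraps over two lines stays together.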
I suggest reading the GitHub document "Extract printed text (OCR) using the Computer Vision REST API and Node.js" for more information.