
Azure Computer Vision: Recognize Printed Text

I'm using Azure Computer Vision with Node.js, and I would like to extract text from images. It works as expected, but I'm facing some challenges. The code:

'use strict';

const request = require('request');

const subscriptionKey = 'key';
const endpoint = 'endpoint';

var uriBase = endpoint + 'vision/v3.1/ocr';

const imageUrl = 'https://livesimply.me/wp-content/uploads/2015/09/foods-to-avoid-real-food-3036-2-1024x683.jpg';

// Request parameters.
const params = {
    'language': 'unk',
    'detectOrientation': 'true',
};

const options = {
    uri: uriBase,
    qs: params,
    body: '{"url": ' + '"' + imageUrl + '"}',
    headers: {
        'Content-Type': 'application/json',
        'Ocp-Apim-Subscription-Key' : subscriptionKey
    }
};

request.post(options, (error, response, body) => {
    if (error) {
        console.log('Error: ', error);
        return;
    }
    let jsonResponse = JSON.stringify(JSON.parse(body), null, '  ');
    console.log('JSON Response\n');
    console.log(jsonResponse);
});

The output:

"regions": [
{
  "boundingBox": "0,191,277,281",
  "lines": [
    {
      "boundingBox": "53,191,23,49",
      "words": [
        {
          "boundingBox": "53,191,23,49",
          "text": "in"
        }
      ]
    },
    {
      "boundingBox": "0,285,277,82",
      "words": [
        {
          "boundingBox": "0,285,150,82",
          "text": ")arb.0g"
        },
        {
          "boundingBox": "214,288,63,63",
          "text": "0%"
        }
      ]
    },
    {
      "boundingBox": "14,393,45,79",
      "words": [
        {
          "boundingBox": "14,393,45,79",
          "text": "Og"
        }
      ]
    },
    {
      "boundingBox": "213,394,63,63",
      "words": [
        {
          "boundingBox": "213,394,63,63",
          "text": "00/0"
        }
      ]
    }
  ]
},
{
  "boundingBox": "322,184,352,457",
  "lines": [
    {
      "boundingBox": "326,184,348,54",
      "words": [
        {
          "boundingBox": "326,184,239,52",
          "text": "INGREDIENTS:"
        },
        {
          "boundingBox": "588,188,86,50",
          "text": "WHITE"
        }
      ]
    },
    {
      "boundingBox": "325,248,281,59",
      "words": [
        {
          "boundingBox": "325,248,83,56",
          "text": "TUNA,"
        },
        {
          "boundingBox": "417,250,127,51",
          "text": "SOYBEAN"
        },
        {
          "boundingBox": "555,252,51,55",
          "text": "OIL,"
        }
      ]
    },
    {
      "boundingBox": "324,313,341,60",
      "words": [
        {
          "boundingBox": "324,313,155,52",
          "text": "VEGETABLE"
        },
        {
          "boundingBox": "489,316,101,56",
          "text": "BROTH,"
        },
        {
          "boundingBox": "598,317,67,56",
          "text": "SALT,"
        }
      ]
    },
    {
      "boundingBox": "324,378,334,53",
      "words": [
        {
          "boundingBox": "324,378,235,52",
          "text": "PYROPHOSPHATE"
        },
        {
          "boundingBox": "566,381,92,50",
          "text": "ADDED"
        }
      ]
    },
    {
      "boundingBox": "323,519,248,52",
      "words": [
        {
          "boundingBox": "323,519,193,51",
          "text": "DISTRIBUTED"
        },
        {
          "boundingBox": "528,521,43,50",
          "text": "BY:"
        }
      ]
    },
    {
      "boundingBox": "322,584,298,57",
      "words": [
        {
          "boundingBox": "322,584,124,50",
          "text": "BUMBLE"
        },
        {
          "boundingBox": "457,585,52,50",
          "text": "BEE"
        },
        {
          "boundingBox": "519,585,101,56",
          "text": "FOODS,"
        }
      ]
    }
  ]
},
{
  "boundingBox": "791,400,198,117",
  "lines": [
    {
      "boundingBox": "921,400,68,45",
      "words": [
        {
          "boundingBox": "921,400,68,45",
          "text": ",11."
        }
      ]
    },
    {
      "boundingBox": "791,464,128,53",
      "words": [
        {
          "boundingBox": "791,464,75,53",
          "text": "PRC:"
        },
        {
          "boundingBox": "874,467,45,48",
          "text": "x"
        }
      ]
    }
  ]
}
  ]
  }

But I'm facing some challenges with this code:

  1. I want the output as a string and not as a JSON tree.
  2. I would like to extract just the ingredients and not all of the text.
  3. In some cases the images may list ingredients without the "ingredients" keyword. How can I extract the ingredients in that case?


Thanks for your help, experts.

Microsoft provides a new online portal for testing this service, among others, and it also lists the input requirements for the Read API. The output is provided both as a string and as JSON.

Link: https://preview.vision.azure.com/demo/OCR

If you need to extract key-value pairs, you can train a custom model using Form Recognizer.

• Custom: extract key-value pairs using a model trained on your own documents

Here is a link to custom models using Form Recognizer.

The Computer Vision REST API extracts printed text from an image with optical character recognition (OCR), and a successful response is returned as JSON. You can't get direct string output from this Azure Cognitive Service.

For the first problem:

I want the output as a string and not JSON tree.

We can't directly print the ingredients as a string the way they appear in the image. To extract the content and display it in a particular format, parse the JSON string you receive into an object and loop over it to pull out the text values. After that, use the split function to store the data in arrays, as shown in the snippet below.

function(error, response, body){
    if(error) {
        console.log(error);
    } else {
        // parse the JSON string
        var jsonObj = JSON.parse(body);

        var ob = jsonObj;
        // declare the accumulator once, before the loops
        var str = "";
        // run nested loops to extract the text values
        for(var i=0;i<....){
            for(var j=0;j<....){
                for(var k=0;k<....){
                    str = str + " "+ob.....text;
                }
                str = str + "\n";
            }
        }
        var arr = str.split("\n");
    }
}

Adapt the loop bounds and property paths to the JSON structure you are getting.
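As a minimal sketch of the loop described above, assuming the response has the regions → lines → words shape shown in the question: the `sample` object is trimmed from that output, and `ocrToLines` is a hypothetical helper name, not part of any Azure SDK.

```javascript
'use strict';

// Trimmed sample in the same shape as the OCR response in the question.
const sample = {
  regions: [
    {
      lines: [
        { words: [{ text: 'INGREDIENTS:' }, { text: 'WHITE' }] },
        { words: [{ text: 'TUNA,' }, { text: 'SOYBEAN' }, { text: 'OIL,' }] },
      ],
    },
  ],
};

// Hypothetical helper: returns one string per OCR line,
// with the words of that line joined by single spaces.
function ocrToLines(ocrResult) {
  const lines = [];
  for (const region of ocrResult.regions || []) {
    for (const line of region.lines || []) {
      lines.push((line.words || []).map((word) => word.text).join(' '));
    }
  }
  return lines;
}

const lines = ocrToLines(sample);
const text = lines.join('\n'); // the whole result as a single string
console.log(text);
```

In the request callback you would pass `JSON.parse(body)` to `ocrToLines` instead of `sample`.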

For your second and third problems:

I would like to extract just the ingredients and not the all text.

In some cases the images may have ingredients without specifying the ingredient key-word, how can I extract the ingredients in this case?

Computer Vision reads all of the printed text in the image and returns it to you as JSON; you can't ask it to extract only particular text. You can get the required result by using the approach described above and then keeping only the ingredients.

I would suggest reading the "Extract printed text (OCR) using the Computer Vision REST API and Node.js" document on GitHub for more information.
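One way this filtering might look, as a rough sketch rather than a definitive method: it assumes the OCR lines have already been flattened to strings, that the list starts at an "INGREDIENTS:" label, and that it ends at the next line containing a colon (such as "DISTRIBUTED BY:"). The function name `extractIngredients`, the keyword pattern, and the stop rule are all assumptions made for illustration.

```javascript
'use strict';

// Line strings as they would read for the second region in the question's output.
const labelLines = [
  'INGREDIENTS: WHITE',
  'TUNA, SOYBEAN OIL,',
  'VEGETABLE BROTH, SALT,',
  'PYROPHOSPHATE ADDED',
  'DISTRIBUTED BY:',
  'BUMBLE BEE FOODS,',
];

// Hypothetical helper: collects text from the keyword up to the next
// label-like line (assumed here to be any later line containing ':').
function extractIngredients(lines, pattern = /ingredients\s*:/i) {
  const start = lines.findIndex((line) => pattern.test(line));
  if (start === -1) {
    return []; // keyword not found; this heuristic cannot help
  }
  const match = pattern.exec(lines[start]);
  const parts = [lines[start].slice(match.index + match[0].length)];
  for (const line of lines.slice(start + 1)) {
    if (line.includes(':')) {
      break;
    }
    parts.push(line);
  }
  // Ingredients are assumed to be comma-separated.
  return parts
    .join(' ')
    .split(',')
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}

const ingredients = extractIngredients(labelLines);
console.log(ingredients);
```

When the keyword is missing, this sketch returns an empty array, so labels without it would need a different rule of your own.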
