
I don't get the expected result when calling the OCR API from Computer Vision

I'm trying to use the ocr method from Computer Vision to extract all the text from a specific image. However, it doesn't return the text that I know is there: when I analyze the same image directly with the demo available on this page https://azure.microsoft.com/es-es/services/cognitive-services/computer-vision/ , it does return the data.

This is the image I'm trying to get the data from: https://bitbucket.org/miguel_acevedo_ve/python-stream/raw/086279ad6885a490e521785ba288914ed98cfd1d/test.jpg

I have followed the Python tutorials available on the Azure documentation site.

import requests
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
from PIL import Image
from io import BytesIO

subscription_key = "<Subscription Key>"

assert subscription_key

vision_base_url = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/"

ocr_url = vision_base_url + "ocr"

image_url = "https://bitbucket.org/miguel_acevedo_ve/python-stream/raw/086279ad6885a490e521785ba288914ed98cfd1d/test.jpg"

'''image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/" + \
    "Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png"
'''

headers = {'Ocp-Apim-Subscription-Key': subscription_key}
params  = {'mode' : 'Printed'}
data    = {'url': image_url}
response = requests.post(ocr_url, headers=headers, params=params, json=data)
response.raise_for_status()

analysis = response.json()
print(analysis)

and this is my current output:

{u'regions': [], u'textAngle': 0.0, u'orientation': u'NotDetected', u'language': u'unk'}
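
For comparison, when the ocr endpoint does find text, the words should be nested under regions, then lines, then words in the JSON. A minimal sketch, assuming that layout (the helper name extract_ocr_text and the sample response are made up for illustration), flattens a response into lines of text. With the output above it simply returns an empty list:

```python
def extract_ocr_text(analysis):
    """Return one string per detected line from an ocr response dict.

    Assumes the layout regions -> lines -> words, where each word is a
    dict with a "text" key.
    """
    lines = []
    for region in analysis.get("regions", []):
        for line in region.get("lines", []):
            words = [word["text"] for word in line.get("words", [])]
            lines.append(" ".join(words))
    return lines


# Hypothetical response, used only to illustrate the expected shape.
sample = {
    "language": "en",
    "regions": [
        {"lines": [{"words": [{"text": "Hello"}, {"text": "world"}]}]}
    ],
}
print(extract_ocr_text(sample))

# The empty response from the question yields no lines at all.
print(extract_ocr_text({"regions": [], "language": "unk"}))
```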

UPDATE: The solution is to use recognizeText, not the ocr function, from Computer Vision.

I see there are two images in your code.

The one in your comment block is the image below. It is a clean sample, similar to those in the well-known MNIST dataset of handwritten digits. The defining trait of this kind of image is that it contains no strongly noisy pixels.

[image: Atomist_quote_from_Democritus.png]

However, the other one, below, has strongly noisy pixels all over the image; I would estimate over 99% of it.

[image: test.jpg]

So these are two different scenarios. The OCR performance of Azure Cognitive Services depends on the samples used to train the model, so in practice the ocr method in Computer Vision can only detect text in images that resemble its training samples.

The correct approach for the second image is to first detect a sufficiently small area of pixels that contains the text, and then crop the image to that area before calling ocr. For example, to read a license number from a photo of the front of a car, only the part of the image showing the license plate is needed.
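
The cropping step can be sketched as follows. This is only an illustration: the answer does not say how the text area is found, so the region coordinates are assumed to come from some separate detection step, and the helper name padded_crop_box and the padding value are made up.

```python
def padded_crop_box(x, y, width, height, image_width, image_height, pad=10):
    """Turn an (x, y, width, height) text region into a
    (left, upper, right, lower) box, grown by `pad` pixels on every side
    and clamped so it never leaves the image."""
    left = max(0, x - pad)
    upper = max(0, y - pad)
    right = min(image_width, x + width + pad)
    lower = min(image_height, y + height + pad)
    return (left, upper, right, lower)


# Hypothetical region: a 200x60 text area at (120, 40) in a 640x480 image.
box = padded_crop_box(120, 40, 200, 60, image_width=640, image_height=480)
print(box)  # (110, 30, 330, 110)
```

The tuple is in the order that Pillow's Image.crop expects, so with PIL already imported as in the question, Image.open(path).crop(box) would give the sub-image to send to the service.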

The solution is to use the recognizeText method and not the ocr method from computer vision.

First you need to send a POST request, and then, with the operation ID it returns, make a GET request to obtain the results.

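vision_base_url = "https://westeurope.api.cognitive.microsoft.com/vision/v2.0/"

ocr_url = vision_base_url + "recognizeText"

# imgByteArr is assumed to hold the raw image bytes; sending binary data
# requires 'Content-Type': 'application/octet-stream' in headers
response = requests.post(
    ocr_url, headers=headers, params=params, data=imgByteArr)
response.raise_for_status()
# The POST returns no result body, only the URL to query for the result
operationLocation = response.headers['Operation-Location']
response = requests.get(operationLocation, headers=headers)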
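
A single GET may arrive before the recognition has finished, so in practice the result URL usually has to be polled. Here is a minimal sketch of that loop, assuming the response JSON carries a "status" field ("Succeeded" or "Failed" when done) and the text under recognitionResult, lines, as I understand the v2.0 format. The function name wait_for_text and the fake responses are made up for illustration.

```python
import time


def wait_for_text(fetch_status, max_attempts=10, delay=1.0):
    """Call fetch_status() until the operation finishes.

    fetch_status is any zero-argument callable that returns the parsed
    JSON of the Operation-Location URL. Returns the recognized lines of
    text as a list of strings.
    """
    for _ in range(max_attempts):
        result = fetch_status()
        status = result.get("status")
        if status == "Succeeded":
            lines = result.get("recognitionResult", {}).get("lines", [])
            return [line["text"] for line in lines]
        if status == "Failed":
            raise RuntimeError("text recognition failed")
        time.sleep(delay)  # still queued or running: wait and retry
    raise TimeoutError("text recognition did not finish in time")


# Fake responses standing in for the service, to show the control flow.
fake = iter([
    {"status": "Running"},
    {"status": "Succeeded",
     "recognitionResult": {"lines": [{"text": "hello"}, {"text": "world"}]}},
])
print(wait_for_text(lambda: next(fake), delay=0))  # ['hello', 'world']
```

With the variables from the snippet above, fetch_status could be lambda: requests.get(operationLocation, headers=headers).json().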
