Tesseract: supposedly easy images are converted to the wrong digits

Question

Below are some images that Tesseract recognized incorrectly.

47 is recognized as "4]".

55 is recognized as "S55".

90 is recognized as "IQ".

I thought the images were quite clean and should be easy for Tesseract to recognize, but the results are wrong. The code I used is shown below.

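import glob

import cv2
import pytesseract
from PIL import Image

# Raw string so the backslashes in the Windows path are kept literally.
# --psm 10 tells Tesseract to treat the image as a single character.
tessdata_dir_config = r'--tessdata-dir "C:\Program Files (x86)\Tesseract-OCR" --psm 10'

for i in glob.glob('*.png'):
    img = cv2.imread(i, 0)  # 0 = load as grayscale
    result = pytesseract.image_to_string(Image.fromarray(img), config=tessdata_dir_config)
    print(result)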

Does anyone know what is going on and how to improve the recognition accuracy?

Answer 1

Okay, I found an answer to my own question. It seems that Tesseract doesn't like bold characters, so you have to erode the black part of the characters a little. But beware that cv2.erode erodes the white (bright) part of the image, so to thin black characters on a white background we have to use cv2.dilate instead.

import cv2
import numpy as np
import pytesseract
from PIL import Image

tessdata_dir_config = r'--tessdata-dir "D:\Program Files\Tesseract-ocr" --psm 10'

for i in ['47-4].png', '55-S55.png', '90-IQ.png']:
    img = cv2.imread(i, 0)

    # Dilating with a 3x3 kernel grows the white background, which thins
    # the black strokes; recognition improves afterwards.
    kernel = np.ones((3, 3), np.uint8)
    img = cv2.dilate(img, kernel, iterations=2)

    # The ./output/ directory must already exist.
    cv2.imwrite("./output/" + i[:-4] + '_dilate.png', img)
    result = pytesseract.image_to_string(Image.fromarray(img), config=tessdata_dir_config)
    print(result)
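To see why dilation rather than erosion thins dark strokes, here is a minimal pure-Python sketch of a 3x3 grayscale dilation and erosion on a tiny synthetic image. It is only a stand-in for what cv2.dilate and cv2.erode do with a 3x3 kernel of ones, not OpenCV's implementation, and the helper names (morph3x3, dilate3x3, erode3x3, count_black) are made up for this illustration. It assumes 0 is black ink and 255 is white background.

```python
# Pure-Python stand-in for 3x3 grayscale dilation/erosion (a sketch, not OpenCV's code).
# Assumption: 0 = black ink, 255 = white paper. All helper names are hypothetical.

def morph3x3(img, pick):
    """Replace each pixel with pick() of its 3x3 neighbourhood (clipped at the border)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                img[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(pick(window))
        out.append(row)
    return out


def dilate3x3(img):
    # Dilation takes the neighbourhood maximum, so bright (white) regions grow.
    return morph3x3(img, max)


def erode3x3(img):
    # Erosion takes the neighbourhood minimum, so bright (white) regions shrink.
    return morph3x3(img, min)


def count_black(img):
    return sum(px == 0 for row in img for px in row)


# A 7x7 white image with a 3-pixel-wide vertical black stroke (columns 2..4).
stroke = [[0 if 2 <= x <= 4 else 255 for x in range(7)] for _ in range(7)]

thinned = dilate3x3(stroke)    # white grows: only column 3 stays black
thickened = erode3x3(stroke)   # white shrinks: columns 1..5 become black

print(count_black(stroke), count_black(thinned), count_black(thickened))  # 21 7 35
```

If the image were white text on a black background, the roles would be swapped and erosion would be the operation that thins the characters.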

I would like to see whether there is a better analysis of this question, so I will leave it open for a while and then choose the best answer.

Answer 2

I had a problem reading text from Android device screens: it worked on some devices but not on others. I found in the Tesseract documentation that it has something to do with image DPI.

Tesseract works best on images which have a DPI of at least 300 dpi, so it may be beneficial to resize images. For more information see the FAQ.

So I used cv2's resize function to rescale the image.

    import cv2
    import pytesseract
    from PIL import Image

    path = "/home/share/workspace/NNW4JJ4T4LR4G66H_ZTE_Blade_L5/clock_present_cropped.png"
    path2 = "/home/share/workspace/NNW4JJ4T4LR4G66H_ZTE_Blade_L5/clock_present_cropped_2.png"
    crop_img2 = cv2.imread(path)
    # Scale the image to half its size in both dimensions.
    img_scaled = cv2.resize(crop_img2, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_LINEAR)
    cv2.imwrite(path2, img_scaled)
    crop_img2 = Image.open(path2)
    result = pytesseract.image_to_string(crop_img2)

Now it works well on all devices.
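The snippet above hard-codes a factor of 0.5. If the effective DPI of the source image is known, the factor could instead be derived from the 300 dpi figure quoted from the documentation. The following is a rough sketch of that arithmetic only; the helper names (scale_for_target_dpi, scaled_size) are hypothetical, and it assumes you can estimate the source DPI yourself, since screenshots often carry no reliable DPI metadata.

```python
# Sketch: derive a resize factor from an assumed source DPI.
# Helper names are hypothetical; the 300 dpi target comes from the quoted documentation.

def scale_for_target_dpi(current_dpi, target_dpi=300.0):
    """Return the factor that maps current_dpi onto target_dpi."""
    if current_dpi <= 0:
        raise ValueError("current_dpi must be positive")
    return target_dpi / current_dpi


def scaled_size(width, height, factor):
    """Return the (width, height) in pixels after scaling, never below 1 pixel."""
    return (max(1, int(round(width * factor))),
            max(1, int(round(height * factor))))


# A 150 dpi source would be enlarged 2x; a 600 dpi source would be halved,
# which matches the fx=0.5, fy=0.5 used in the answer.
print(scale_for_target_dpi(150))      # 2.0
print(scale_for_target_dpi(600))      # 0.5
print(scaled_size(200, 80, 0.5))      # (100, 40)
```

The resulting factor would be passed as both fx and fy to cv2.resize in place of the fixed 0.5.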

2 answers

Answer 1: 2017-09-12 06:48:17

Answer 2: 2018-05-13 14:03:05
