Improving pytesseract's text recognition from images

Question

I am trying to read a captcha using the pytesseract module. It gives accurate text most of the time, but not always.

This is the code that reads the image, manipulates it, and extracts the text from it.

import cv2
import numpy as np
import pytesseract

def read_captcha():
    # opencv loads the image in BGR, convert it to RGB
    img = cv2.cvtColor(cv2.imread('captcha.png'), cv2.COLOR_BGR2RGB)

    lower_white = np.array([200, 200, 200], dtype=np.uint8)
    upper_white = np.array([255, 255, 255], dtype=np.uint8)

    mask = cv2.inRange(img, lower_white, upper_white)  # could also use threshold
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3)))  # "erase" the small white points in the resulting mask
    mask = cv2.bitwise_not(mask)  # invert mask

    # load background (could be an image too)
    bk = np.full(img.shape, 255, dtype=np.uint8)  # white bk

    # get masked foreground
    fg_masked = cv2.bitwise_and(img, img, mask=mask)

    # get masked background, mask must be inverted 
    mask = cv2.bitwise_not(mask)
    bk_masked = cv2.bitwise_and(bk, bk, mask=mask)

    # combine masked foreground and masked background 
    final = cv2.bitwise_or(fg_masked, bk_masked)
    mask = cv2.bitwise_not(mask)  # revert mask to original

    # resize the image
    img = cv2.resize(mask,(0,0),fx=3,fy=3)
    cv2.imwrite('ocr.png', img)

    text = pytesseract.image_to_string(cv2.imread('ocr.png'), lang='eng')

    return text

For the image manipulation, I got help from this Stack Overflow post.

And this is the original captcha image:

And this image is generated after the manipulation:

But using pytesseract, I get the text: AX#7rL.

Can anyone guide me on how to improve the success rate to 100% here?

Answer 1

Since there are tiny holes in your resulting image, a morphological transformation, specifically cv2.MORPH_CLOSE, should work here: it closes the holes and smooths the image.

Threshold to obtain a binary image (black and white)

Perform morphological operations to close small holes in the foreground

Invert the image to get the result

4X#7rL

Potentially, applying cv2.GaussianBlur() before passing the image to Tesseract would help too.

import cv2
import pytesseract

# Path for Windows
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Read in image as grayscale
image = cv2.imread('1.png',0)
# Threshold to obtain binary image
thresh = cv2.threshold(image, 220, 255, cv2.THRESH_BINARY)[1]

# Create custom kernel
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
# Perform closing (dilation followed by erosion)
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)

# Invert image to use for Tesseract
result = 255 - close
cv2.imshow('thresh', thresh)
cv2.imshow('close', close)
cv2.imshow('result', result)

# Throw image into tesseract
print(pytesseract.image_to_string(result))
cv2.waitKey()

Answer 1: score 5, accepted, 2019-07-25 21:50:23
