简体   繁体   中英

Improving pytesseract correct text recognition from image

I am trying to read captcha using pytesseract module. And it is giving accurate text most of the time, but not all the time.

This is code to read the image, manipulate the image and extract text from the image.

import cv2
import numpy as np
import pytesseract

def read_captcha():
    # opencv loads the image in BGR, convert it to RGB
    img = cv2.cvtColor(cv2.imread('captcha.png'), cv2.COLOR_BGR2RGB)

    lower_white = np.array([200, 200, 200], dtype=np.uint8)
    upper_white = np.array([255, 255, 255], dtype=np.uint8)

    mask = cv2.inRange(img, lower_white, upper_white)  # could also use threshold
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3)))  # "erase" the small white points in the resulting mask
    mask = cv2.bitwise_not(mask)  # invert mask

    # load background (could be an image too)
    bk = np.full(img.shape, 255, dtype=np.uint8)  # white bk

    # get masked foreground
    fg_masked = cv2.bitwise_and(img, img, mask=mask)

    # get masked background, mask must be inverted 
    mask = cv2.bitwise_not(mask)
    bk_masked = cv2.bitwise_and(bk, bk, mask=mask)

    # combine masked foreground and masked background 
    final = cv2.bitwise_or(fg_masked, bk_masked)
    mask = cv2.bitwise_not(mask)  # revert mask to original

    # resize the image
    img = cv2.resize(mask,(0,0),fx=3,fy=3)
    cv2.imwrite('ocr.png', img)

    text = pytesseract.image_to_string(cv2.imread('ocr.png'), lang='eng')

    return text

For manipulation of the image, I have got help from this stackoverflow post.

And this the original captcha image:

在此处输入图片说明

And this image is generated after the manipulation:

在此处输入图片说明

But, by using pytesseract , I am getting text: AX#7rL .

Can anyone guide me on how to improve the success rate to 100% here?

Since there are tiny holes in your resulting image, morphological transformations, specifically cv2.MORPH_CLOSE , to close the holes and smooth the image should work here

Threshold to obtain a binary image (black and white)

在此处输入图片说明

Perform morphological operations to close small holes in the foreground

在此处输入图片说明

Inverse the image to get result

在此处输入图片说明

4X#7rL

Potentially a cv2.GaussianBlur() before inserting into tesseract would help too

import cv2
import pytesseract

# Path for Windows
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Read in image as grayscale
image = cv2.imread('1.png',0)
# Threshold to obtain binary image
thresh = cv2.threshold(image, 220, 255, cv2.THRESH_BINARY)[1]

# Create custom kernel
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
# Perform closing (dilation followed by erosion)
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)

# Invert image to use for Tesseract
result = 255 - close
cv2.imshow('thresh', thresh)
cv2.imshow('close', close)
cv2.imshow('result', result)

# Throw image into tesseract
print(pytesseract.image_to_string(result))
cv2.waitKey()

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM