Improving pytesseract correct text recognition from image

Question

I am trying to read a captcha using the pytesseract module. It returns the correct text most of the time, but not always.

This is the code that reads the image, manipulates it, and extracts the text:

import cv2
import numpy as np
import pytesseract

def read_captcha():
    # opencv loads the image in BGR, convert it to RGB
    img = cv2.cvtColor(cv2.imread('captcha.png'), cv2.COLOR_BGR2RGB)

    lower_white = np.array([200, 200, 200], dtype=np.uint8)
    upper_white = np.array([255, 255, 255], dtype=np.uint8)

    mask = cv2.inRange(img, lower_white, upper_white)  # could also use threshold
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3)))  # "erase" the small white points in the resulting mask
    mask = cv2.bitwise_not(mask)  # invert mask

    # load background (could be an image too)
    bk = np.full(img.shape, 255, dtype=np.uint8)  # white bk

    # get masked foreground
    fg_masked = cv2.bitwise_and(img, img, mask=mask)

    # get masked background, mask must be inverted 
    mask = cv2.bitwise_not(mask)
    bk_masked = cv2.bitwise_and(bk, bk, mask=mask)

    # combine masked foreground and masked background 
    final = cv2.bitwise_or(fg_masked, bk_masked)
    mask = cv2.bitwise_not(mask)  # revert mask to original

    # resize the image
    img = cv2.resize(mask,(0,0),fx=3,fy=3)
    cv2.imwrite('ocr.png', img)

    text = pytesseract.image_to_string(cv2.imread('ocr.png'), lang='eng')

    return text

For the image manipulation, I got help from this Stack Overflow post.

And this is the original captcha image:

And this image is generated after the manipulation:

But pytesseract returns the text: AX#7rL.

Can anyone guide me on how to improve the success rate to 100% here?

Answer 1

Since there are tiny holes in your resulting image, a morphological transformation, specifically cv2.MORPH_CLOSE, should work here: it closes the holes and smooths the image. The steps are:

Threshold to obtain a binary image (black and white)

Perform morphological operations to close small holes in the foreground

Invert the image to get the result

Result from Tesseract: 4X#7rL

Applying cv2.GaussianBlur() before passing the image to Tesseract might help too.

import cv2
import pytesseract

# Path for Windows
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Read in image as grayscale
image = cv2.imread('1.png',0)
# Threshold to obtain binary image
thresh = cv2.threshold(image, 220, 255, cv2.THRESH_BINARY)[1]

# Create custom kernel
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
# Perform closing (dilation followed by erosion)
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)

# Invert image to use for Tesseract
result = 255 - close
cv2.imshow('thresh', thresh)
cv2.imshow('close', close)
cv2.imshow('result', result)

# Throw image into tesseract
print(pytesseract.image_to_string(result))
cv2.waitKey()

solution1
5 ACCEPTED 2019-07-25 21:50:23