
pytesseract improving OCR accuracy for blurred numbers on an image

Example of numbers


I am using the standard pytesseract image-to-text call. I have tried the digits-only option; 90% of the time the result is perfect, but above is an example where it goes horribly wrong: this example produced no characters at all.

As you can see, there are no letters, so the language option is of no use. I did try adding some text to the grabbed image, but it still goes wrong.

I increased the contrast using cv2. The text has been blurred upstream of my capture.

Any ideas on increasing accuracy?

After many tests using the suggestions below, I found the sharpening filter gave unreliable results. Another tool you can use is contrast = cv2.convertScaleAbs(img2, alpha=2.5, beta=-200). I used this because my black-and-white text ended up as light gray text on a gray background; with convertScaleAbs I was able to increase the contrast to get an almost black-and-white image.

Basic steps for OCR

  1. Convert to monochrome
  2. Crop image to your target text
  3. Filter image to get black and white
  4. Perform OCR

Here's a simple approach using OpenCV and Pytesseract OCR. To perform OCR on an image, it's important to preprocess the image. The idea is to obtain a processed image where the text to extract is in black with the background in white. To do this, we can convert to grayscale, then apply a sharpening kernel using cv2.filter2D() to enhance the blurred sections. A general sharpening kernel looks like this:

[[-1,-1,-1], [-1,9,-1], [-1,-1,-1]]

Other kernel variations exist, and depending on the image you can adjust the strength of the filter. From here we apply Otsu's threshold to obtain a binary image, then perform text extraction using the --psm 6 configuration option to assume a single uniform block of text. The Tesseract documentation lists more OCR configuration options.


Here's a visualization of the image processing pipeline:

Input image


Convert to grayscale -> apply sharpening filter


Otsu's threshold


Result from Pytesseract OCR

124,685

Code

import cv2
import numpy as np
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Load image, grayscale, apply sharpening filter, Otsu's threshold 
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
sharpen_kernel = np.array([[-1,-1,-1], [-1,9,-1], [-1,-1,-1]])
sharpen = cv2.filter2D(gray, -1, sharpen_kernel)
# THRESH_BINARY_INV flips polarity: pixels above the Otsu level become black
thresh = cv2.threshold(sharpen, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# OCR
data = pytesseract.image_to_string(thresh, lang='eng', config='--psm 6')
print(data)

cv2.imshow('sharpen', sharpen)
cv2.imshow('thresh', thresh)
cv2.waitKey()
