I'm trying to OCR these digits, but Tesseract is not recognizing them properly.
import cv2
from pytesseract import image_to_string
image = cv2.imread('PATHTOIMAGE', cv2.IMREAD_COLOR)
image = cv2.resize(image, None, fx=5, fy=5, interpolation=cv2.INTER_CUBIC)
gaussian = cv2.GaussianBlur(image, (5, 5), 2)
mask = cv2.inRange(gaussian, (250, 250, 250), (255, 255, 255))
ocr = image_to_string(mask, config='-c tessedit_char_whitelist=0123456789')
print(ocr)
The masking result is the following:
OCR result: 88311
I tried some morphological operations (dilating and opening), but no luck.
I also tried detecting contours and recognizing the digits one by one, but that did not work either.
How else could I improve?
I was able to get correct results without the scaling and Gaussian blur steps. I also inverted the mask and used only --psm 6:
image = cv2.imread('PATHTOIMAGE', cv2.IMREAD_COLOR)
mask = cv2.inRange(image, (250, 250, 250), (255, 255, 255))
ocr = image_to_string(~mask, config='--psm 6')
print(ocr)
For what it's worth, I cannot reproduce the 2 vs 8 confusion. I am running Tesseract 4.1.1 and pytesseract 0.3.8 on Windows.
If you still need to explore other preprocessing steps, consider running a sequence of erosion and dilation operations with asymmetric kernels. For example:
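import numpy as np

# Kernel shapes are (rows, cols): tall kernels act vertically, wide kernels horizontally.
# Pass iterations by keyword; the third positional argument of erode/dilate is dst.
kernel = np.ones((6, 2), np.uint8)
img = cv2.erode(mask, kernel, iterations=1)
kernel = np.ones((2, 40), np.uint8)
img = cv2.dilate(img, kernel, iterations=1)
kernel = np.ones((1, 40), np.uint8)
img = cv2.erode(img, kernel, iterations=1)
kernel = np.ones((4, 2), np.uint8)
img = cv2.dilate(img, kernel, iterations=1)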
These kernel sizes are for your scaled image. The sequence appears to remove the vertical points on the 2 while retaining the other relevant information. You will need to tune the sizes and check that the result works for every digit.