Extracting the time from an image using pytesseract in Python

Question

I would like to extract the time from some images using pytesseract in Python, but it doesn't produce any output.

The code I was using is as follows:

import pytesseract
from PIL import Image, ImageOps

im = Image.open(r'im.jpg')    
im_invert = ImageOps.invert(im)
text = pytesseract.image_to_string(im_invert)
print(text)

The original image:

Image after inversion operation:

When I ran the code above, the only thing I got was:

Is there anything wrong with my code?

Answer 1

If you can use EasyOCR, then the approach below works for your input image.

I tested the given original image in Google Colab. To display the output images locally, use cv2.imshow(...) and cv2.waitKey(0).
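For running outside Colab, the cv2_imshow calls can be swapped for OpenCV's own window functions. A small sketch of that substitution; `show_locally` is a hypothetical helper name, and it needs a desktop session because it opens GUI windows:

```python
def show_locally(images):
    """Show each named image in its own window until a key is pressed."""
    import cv2  # OpenCV is only needed when the helper is actually called

    for title, image in images.items():
        cv2.imshow(title, image)
    cv2.waitKey(0)           # block until any key is pressed
    cv2.destroyAllWindows()  # close every window opened above
```

It would be called with a dict of window titles and image arrays, for example `show_locally({"Thresholding": th1})`.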

Here, a median blur is first applied to the grayscale image. Next, thresholding, erosion, and dilation are applied. In this case, Median Blur + Thresholding gives almost the same confidence as Median Blur + Thresholding + Erosion + Dilation.

Image

OCR Prediction Including Confidence

Thresholding:
[([[3, 1], [270, 1], [270, 60], [3, 60]], '09:01:00', 0.797291100025177)]

Erosion:
[([[2, 2], [270, 2], [270, 58], [2, 58]], '09:01:00', 0.4145631492137909)]

Dilation:
[([[3, 1], [270, 1], [270, 60], [3, 60]], '09:01:00', 0.7948697805404663)]
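
Each EasyOCR result above is a (bounding box, text, confidence) tuple, so the three variants can be compared programmatically. A minimal sketch using the values printed above; the names `results`, `TIME_PATTERN`, and `best_reading` are hypothetical, and the HH:MM:SS check is an added assumption about the expected format rather than something the answer does:

```python
import re
from datetime import datetime

# Readings copied from the output above: (bounding box, text, confidence)
results = {
    "Thresholding": [([[3, 1], [270, 1], [270, 60], [3, 60]], "09:01:00", 0.797291100025177)],
    "Erosion": [([[2, 2], [270, 2], [270, 58], [2, 58]], "09:01:00", 0.4145631492137909)],
    "Dilation": [([[3, 1], [270, 1], [270, 60], [3, 60]], "09:01:00", 0.7948697805404663)],
}

# 24-hour HH:MM:SS (assumed format of the clock in the image)
TIME_PATTERN = re.compile(r"(?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d")

def best_reading(results):
    """Return (variant, text, confidence) for the most confident HH:MM:SS reading."""
    candidates = [
        (variant, text, conf)
        for variant, detections in results.items()
        for _bbox, text, conf in detections
        if TIME_PATTERN.fullmatch(text)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c[2])

variant, text, conf = best_reading(results)
parsed = datetime.strptime(text, "%H:%M:%S").time()
print(variant, parsed, round(conf, 3))
```

Filtering on the pattern before comparing confidences means a confident but malformed reading cannot win over a well-formed one.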

Code

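import cv2
import easyocr
import numpy as np
from google.colab.patches import cv2_imshow  # Colab-only replacement for cv2.imshow

# Load the OCR model into memory; this only needs to run once
reader = easyocr.Reader(['ch_sim','en'])

# Read the image as grayscale (flag 0) and apply a median blur to reduce noise
img = cv2.imread('1.jpg', 0)
img = cv2.medianBlur(img, 5)

# Binarize: pixels above 127 become 255, the rest become 0
ret, th1 = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)

#th1 = cv2.bitwise_not(th1)
# Erode the thresholded image, then dilate the eroded result, with a 3x3 kernel
kernel = np.ones((3,3), np.uint8)
erosion = cv2.erode(th1, kernel, iterations = 1)
dilation = cv2.dilate(erosion, kernel, iterations = 1)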

print("Thresholding:\n")
cv2_imshow(th1)
print("\nErosion:\n")
cv2_imshow(erosion)
print("\nDilation:\n")
cv2_imshow(dilation)

print("Thresholding:")
result = reader.readtext(th1)
print(result)

print("Erosion:")
result = reader.readtext(erosion)
print(result)

print("Dilation:")
result = reader.readtext(dilation)
print(result)
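
Since the question was about pytesseract, the same preprocessing can also be fed to Tesseract. This is an untested sketch and not part of the answer above: it assumes the Tesseract binary is installed, that the time sits on a single text line (`--psm 7`), and that the installed Tesseract version honors `tessedit_char_whitelist`. The name `read_time_with_tesseract` and the `invert` flag are hypothetical. Tesseract generally does better with dark text on a light background, so whether inversion helps depends on the image.

```python
# Single text line, digits and colon only (assumed settings, not from the answer)
TESS_CONFIG = "--psm 7 -c tessedit_char_whitelist=0123456789:"

def read_time_with_tesseract(path, invert=False, config=TESS_CONFIG):
    """Blur, threshold, optionally invert, then OCR the image with Tesseract."""
    # Imported here so the sketch can be loaded without the OCR dependencies
    import cv2
    import pytesseract

    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(path)
    gray = cv2.medianBlur(gray, 5)
    _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    if invert:
        binary = cv2.bitwise_not(binary)
    return pytesseract.image_to_string(binary, config=config).strip()
```

A call such as `read_time_with_tesseract('1.jpg')` returns whatever text Tesseract recognizes; trying both `invert=False` and `invert=True` is a cheap way to see which polarity it handles better.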

Answer 1
1 2020-11-18 19:05:26

Image图片

OCR Prediction Including Confidence OCR 预测，包括置信度

Code代码

在 Python 中使用 pytesseract 从图像中提取时间

问题描述

1 个解决方案

解决方案1 1 2020-11-18 19:05:26

Image图片

OCR Prediction Including Confidence OCR 预测，包括置信度

Code代码

解决方案1
1 2020-11-18 19:05:26