Recognize poorly visible digits on an image with the same color

I'm trying to extract digits from cropped images using Tesseract/PaddleOCR. I'm using OpenCV to preprocess the image for better recognition. I tried applying a Gaussian blur followed by a threshold for binarization, but the result is pretty bad.

Here is the code for reading the image and converting it to grayscale. The grayscale version is better, but it's still poor, and I cannot extract text from this image:

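import cv2
import matplotlib.pyplot as plt

def display(img, cmap='gray'):
    # Show an image on a large figure; cmap only affects single-channel images.
    fig = plt.figure(figsize=(12, 10))
    ax = fig.add_subplot(111)
    ax.imshow(img, cmap=cmap)

img = cv2.imread("/content/PXL_20211019_171419721.MP.jpg")
# OpenCV loads images as BGR; convert to RGB so matplotlib shows the true colors.
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
display(gray)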

Here is the cropped image I am using.

IMO: you cannot get a good result from bad input. Focus on getting a better input image, or you will need "human OCR".
