Unable to extract text from an image using Python OCR (pytesseract)

Question

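I am trying to extract text from some images. It works for hundreds of other images, but in some cases it finds no text at all. To optimize the images for the extraction stage, all of them are converted to black and white. Every image has a white background, and everything else (icons, text, and so on) is black.

For example, it works on the image below and successfully finds the text "Sleep timer". I'm not sure whether it matters, but the image with the "Sleep timer" text is 320 × 351.

For the image below, however, it finds no text at all. This image is 161 × 320.

Since I couldn't find the cause, I tried resizing the image, but that didn't help.

Here is my code: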

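from pytesseract import Output
import pytesseract
import cv2

# 'imagePath' is a placeholder for the path of the image to read
image = cv2.imread('imagePath')

# OpenCV loads images as BGR; convert to RGB before passing to Tesseract
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
results = pytesseract.image_to_data(rgb, output_type=Output.DICT)

for i in range(len(results["text"])):

    text = results["text"][i]
    # Depending on the pytesseract version, conf can arrive as an int,
    # a float, or a numeric string, so go through float() first
    conf = int(float(results["conf"][i]))

    print("Confidence: {}".format(conf))
    print("Text: {}".format(text))
    print("")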

Answer 1

It works for me. I tested it with:

import pytesseract

print(pytesseract.image_to_string('../images/grmgrm.jfif'))
results = pytesseract.image_to_data('../images/grmgrm.jfif', output_type=pytesseract.Output.DICT)
print(results)

Are you getting an error? If so, show us the error you get.
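
If no error is raised and the output is simply empty, it may help to look only at the entries that actually contain a word. image_to_data also returns rows for blocks, paragraphs, and lines, and those carry empty text with a confidence of -1. The sketch below is not part of the original answer. confident_words and ocr_words are hypothetical helper names, and the scale factor of 3 and page segmentation mode 6 are guesses to experiment with, not known fixes for the 161 × 320 image.

```python
def confident_words(results, min_conf=0.0):
    """Return (text, confidence) pairs for the non-empty entries of an
    image_to_data(..., output_type=Output.DICT) result."""
    words = []
    for text, conf in zip(results["text"], results["conf"]):
        conf_value = float(conf)  # accepts -1, "-1", 96 and "96.5" alike
        if text.strip() and conf_value >= min_conf:
            words.append((text.strip(), conf_value))
    return words


def ocr_words(image_path, scale=3, psm=6):
    """Upscale a small image, run Tesseract, and keep only real words.

    scale and psm are assumptions to tune, not values from the thread.
    """
    # Imported here so confident_words stays usable without OpenCV/Tesseract
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError("Could not read image: {}".format(image_path))

    # Enlarge the image; small text is often easier to recognize when bigger
    enlarged = cv2.resize(image, None, fx=scale, fy=scale,
                          interpolation=cv2.INTER_CUBIC)
    rgb = cv2.cvtColor(enlarged, cv2.COLOR_BGR2RGB)

    # --psm selects Tesseract's page segmentation mode
    results = pytesseract.image_to_data(
        rgb, config="--psm {}".format(psm),
        output_type=pytesseract.Output.DICT)
    return confident_words(results)
```

Calling ocr_words('imagePath') with a few different psm values and comparing the returned lists is one way to check whether page segmentation, rather than the image content, is what makes the text disappear.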
