Pytesseract image-to-text problem in Python

Question

Please take a look at the following image:

I am using the code below to extract text from the image.

import cv2
import pytesseract

img = cv2.imread("img.png")  # Load the image (BGR)
txt = pytesseract.image_to_string(img)  # Run OCR with default settings

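But the output differs from the text in the image.

It gives the following result:

+BuFl

But it should be:

+Bu#L

I don't know what the problem is. I am new to Pytesseract.

Can anyone help me solve this?

Thank you very much.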

Answer 1

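One solution is to apply Otsu thresholding.

Unlike thresholding with a manually chosen value, Otsu's method determines the threshold automatically from the image histogram.

Applying Otsu's threshold: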

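import cv2
import pytesseract

img = cv2.imread("Tqom8.png")  # Load the image
img = cv2.resize(img, (0, 0), fx=0.5, fy=0.5)  # Downscale to half size
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # Convert to grayscale
# With THRESH_OTSU the threshold is computed automatically; the 0 is ignored
thr = cv2.threshold(gray, 0, 128, cv2.THRESH_OTSU)[1]
# Run OCR on the thresholded image; --psm 6 assumes a single uniform block of text
txt = pytesseract.image_to_string(thr, config='--psm 6')
print(pytesseract.__version__)
print(txt)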

Result:

0.3.8
+Bu#L
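
To make "finds the threshold automatically" a bit more concrete, here is a minimal pure-Python sketch of the idea behind Otsu's method: try every candidate threshold and keep the one that maximizes the between-class variance of the two resulting pixel groups. This is only an illustration, not OpenCV's actual implementation. The names otsu_threshold and binarize, and the pixel values, are made up for this example.

```python
def otsu_threshold(pixels):
    """Return the 8-bit threshold that maximizes between-class variance."""
    # Histogram of intensities 0..255
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels with intensity <= t
    sum_bg = 0  # sum of the intensities of those pixels
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no pixels in the lower class yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is in the lower class; nothing left to split
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, t, maxval=255):
    """Pixels above t become maxval, all others become 0."""
    return [maxval if p > t else 0 for p in pixels]


# Hypothetical data: a few dark "text" pixels on a bright background
pixels = [40, 50, 60] * 10 + [200, 210, 220] * 30
t = otsu_threshold(pixels)
binary = binarize(pixels, t)
print("threshold:", t)
print("dark pixels:", binary.count(0), "bright pixels:", binary.count(255))
```

No threshold value is passed in anywhere; it falls out of the histogram, which is why the first threshold argument to cv2.threshold does not matter when THRESH_OTSU is set.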

Also be sure to read "Improving the quality of the output" in the Tesseract documentation.

Solution 1
1 2022-01-08 20:57:29
