Pytesseract: unable to read digits on a relatively good-quality image

Question

I have the following image:

Using the tesseract command:

pytesseract.image_to_string(box_img_6, config="--lang= 'eng' --psm 6 --oem 3")

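I get the output: 'nu'

I think tesseract should perform better on this image and should be able to read at least some of the digits.

Can you help me improve Tesseract's performance?

Thank you.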

Answer 1

Try this code:

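import pytesseract
from PIL import Image

# Path to the Tesseract executable (adjust to your installation)
pytesseract.pytesseract.tesseract_cmd = r"C:\Tesseract-OCR\tesseract.exe"

# --psm 10 treats the image as a single character; the whitelist restricts output to digits
text = pytesseract.image_to_string(Image.open(r"a.jpg"), lang='eng',
                        config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')

print(text)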

Output

**3008**
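
The call above combines a page segmentation mode, an engine mode, and a digit whitelist, and passes the language through the `lang` argument rather than inside the config string (the `--lang= 'eng'` text in the question is not a Tesseract option; the CLI flag is `-l`). A minimal sketch of how these pieces could be assembled, assuming the hypothetical helper names `digits_config` and `read_digits`, which are not part of pytesseract:

```python
def digits_config(psm: int = 6, oem: int = 3) -> str:
    """Build a Tesseract config string that restricts output to digits.

    Hypothetical helper for illustration; only the flags themselves
    (--psm, --oem, -c tessedit_char_whitelist) come from Tesseract.
    """
    return f"--psm {psm} --oem {oem} -c tessedit_char_whitelist=0123456789"


def read_digits(image, psm: int = 6) -> str:
    """Run OCR on a PIL image or NumPy array and return the stripped text."""
    # Imported here so digits_config() stays usable without pytesseract installed.
    import pytesseract

    # The language goes in `lang`, not in the config string.
    text = pytesseract.image_to_string(image, lang="eng", config=digits_config(psm))
    return text.strip()
```

Calling `read_digits(Image.open("a.jpg"), psm=10)` would reproduce the settings used in this answer; trying a few `psm` values (for example 6, 7, and 10) is a reasonable way to see which layout assumption suits a given image.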

Answer 2

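You should read the Tesseract documentation page on improving the quality of the output.

However, for this input image you do not need to apply any preprocessing or set any configuration parameters. The result of: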

txt = pytesseract.image_to_string(gray_image)

will be:

with the current latest version of pytesseract (0.3.7).

Code:

import cv2
import pytesseract

# Load the image
img = cv2.imread("wwckp.jpg")

# Convert to the gray-scale
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# OCR
txt = pytesseract.image_to_string(gry)
print(txt)

Answer 1
1, accepted, 2021-03-16 13:56:45

Answer 2
0, 2021-03-17 07:39:20