Optical character recognition with Python Tesseract on a series of symbols

Question

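Hello, I'm looking for guidance. I have been using pytesseract for OCR, but I can't get it to recognize a series of equals signs placed next to each other in an image. Any guidance on how to solve this? I tested the image with AWS Rekognition and Google Vision and got the same result. I also tried selecting an ROI with OpenCV and running OCR on just that region, but the result was still empty, i.e. no characters were recognized. I'd appreciate any guidance.

Thank you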

Answer 1

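Your text seems hard to extract. Try processing the full image when extracting text with Tesseract. I came up with an approach for your case, but as you can see, the bounding boxes of the characters are not what I expected. Here is the code: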

import cv2
import numpy as np
import pytesseract

# Path to the Tesseract executable on this Windows machine
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'

# Load the image and convert it to grayscale
originalImage = cv2.imread('a.png')
grayImage = cv2.cvtColor(originalImage, cv2.COLOR_BGR2GRAY)

# Binarize with inverted polarity, then dilate with a 3x3 kernel to thicken the strokes
(thresh, blackAndWhiteImageOriginal) = cv2.threshold(grayImage, 127, 255, cv2.THRESH_BINARY_INV)
blackAndWhiteImage = cv2.dilate(blackAndWhiteImageOriginal, np.ones((3,3), np.uint8))

# Run OCR as a single text line (--psm 7), restricted to the '=' character
ocr_output_details = pytesseract.image_to_data(blackAndWhiteImage, output_type=pytesseract.Output.DICT, config="--psm 7 -c tessedit_char_whitelist==")

# Draw a red box for every row returned by image_to_data
rgbImage = cv2.cvtColor(blackAndWhiteImage,cv2.COLOR_GRAY2RGB)
for i in range(len(ocr_output_details['level'])):
    (x, y, w, h) = (ocr_output_details['left'][i], ocr_output_details['top'][i], ocr_output_details['width'][i], ocr_output_details['height'][i])
    cv2.rectangle(rgbImage, (x, y), (x + w, y + h), (0,0,255), 2)

print('Text: ', ocr_output_details['text'])
cv2.imshow('Boxes', rgbImage)
cv2.waitKey(0)
cv2.destroyAllWindows()

Result: Result 1
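One possible reason the boxes look off: `image_to_data` returns one row per layout level (page, block, paragraph, line, word), so drawing every row also draws the container boxes. Below is a minimal sketch, assuming the DICT output used above, that keeps only word-level rows with non-empty text. `word_boxes` and `min_conf` are hypothetical names for illustration, not part of pytesseract, and the `sample` dict is hand-written rather than real OCR output.

```python
def word_boxes(data, min_conf=0.0):
    """Return (x, y, w, h, text) for word-level rows of image_to_data output.

    `data` is the dict produced with output_type=pytesseract.Output.DICT.
    Level 5 marks word rows; lower levels describe the page, block,
    paragraph, and line containers.
    """
    boxes = []
    for i, level in enumerate(data['level']):
        text = data['text'][i].strip()
        # conf may arrive as int, float, or str depending on the version
        conf = float(data['conf'][i])
        if level == 5 and text and conf >= min_conf:
            boxes.append((data['left'][i], data['top'][i],
                          data['width'][i], data['height'][i], text))
    return boxes


# Hand-written stand-in for real OCR output, for illustration only
sample = {
    'level': [1, 4, 5],
    'left': [0, 10, 12], 'top': [0, 5, 6],
    'width': [200, 180, 60], 'height': [50, 40, 20],
    'conf': ['-1', '-1', '91.5'],
    'text': ['', '', '===='],
}
print(word_boxes(sample))
```

In the drawing loop above, iterating over `word_boxes(ocr_output_details)` instead of every index would draw boxes only around recognized words.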

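With another, more suitable full image in which the characters have the expected size, I can extract the equals signs perfectly with Tesseract. Here is the code: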

import cv2
import pytesseract

# Path to the Tesseract executable on this Windows machine
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'

# Load the image, convert it to grayscale, and binarize it
originalImage = cv2.imread('b.jpg')
gray = cv2.cvtColor(originalImage, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray, 180, 255, cv2.THRESH_BINARY)

# Run OCR as a single uniform block of text (--psm 6), restricted to the '=' character
results = pytesseract.image_to_data(thresh, config="-c tessedit_char_whitelist== --psm 6")

text = []
# Skip the header row; rows that split into 12 fields carry recognized text
for b in map(str.split, results.splitlines()[1:]):
    if len(b) == 12:
        x, y, w, h = map(int, b[6:10])
        # Draw the box and write the recognized text just below it
        cv2.rectangle(originalImage, (x, y), (x + w, y + h), (255,0,0), 2)
        cv2.putText(originalImage, b[11], (x, y + h + 15), cv2.FONT_HERSHEY_COMPLEX, 0.6, (0,0,0))
        text.append(b[11])

print('Text: ', text)
cv2.imshow("Result", originalImage)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result: Result 2
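The loop above relies on whitespace splitting, so a row counts as text only when it happens to split into exactly 12 fields. An alternative sketch, assuming the default tab-separated string output of `image_to_data`, reads the columns by name with the standard `csv` module. `parse_tsv` is a hypothetical helper name, and `sample_tsv` is hand-written for illustration, not real OCR output.

```python
import csv
import io


def parse_tsv(tsv_text):
    """Yield (x, y, w, h, text) for rows whose text field is not blank."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter='\t',
                            quoting=csv.QUOTE_NONE)
    for row in reader:
        # A missing trailing field comes back as None, so fall back to ''
        text = (row.get('text') or '').strip()
        if text:
            yield (int(row['left']), int(row['top']),
                   int(row['width']), int(row['height']), text)


# Hand-written stand-in for the TSV string, for illustration only
sample_tsv = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\t"
    "left\ttop\twidth\theight\tconf\ttext\n"
    "1\t1\t0\t0\t0\t0\t0\t0\t300\t80\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t20\t30\t120\t18\t96\t=====\n"
)
print(list(parse_tsv(sample_tsv)))
```

Reading by column name keeps the code working if a text value ever contains a space, which would change the field count under a plain whitespace split.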

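You can try to improve the results by following the Tesseract documentation: Tesseract - Improving the quality of the output

The important things are:

Use white as the background color and black as the character color.
Select the appropriate Tesseract psm mode. In the cases above I used psm modes 6 and 7, which treat the image as a single uniform block of text and as a single text line, respectively.
Try the tessedit_char_whitelist setting to specify only the characters you are looking for.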
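The three points above can be sketched as two small helpers. This is only an illustration under stated assumptions: `build_config` and `dark_text_on_light` are hypothetical names, and the polarity check assumes the background covers more of the image than the text does, which may not hold for tightly cropped regions.

```python
import numpy as np


def build_config(psm, whitelist=None):
    """Build a Tesseract config string from a page segmentation mode and
    an optional character whitelist (assumed to contain no spaces)."""
    parts = [f'--psm {psm}']
    if whitelist:
        parts.append(f'-c tessedit_char_whitelist={whitelist}')
    return ' '.join(parts)


def dark_text_on_light(binary):
    """Invert a binary (0/255) uint8 image if most of its pixels are dark.

    Assumption: the background occupies more area than the characters,
    so a mostly dark image means light text on a dark background.
    """
    if np.mean(binary) < 127:
        return 255 - binary
    return binary


# Mostly black image with one white pixel: treated as light-on-dark
demo = np.zeros((4, 4), np.uint8)
demo[1, 1] = 255
fixed = dark_text_on_light(demo)
print(build_config(7, '='), fixed[0, 0], fixed[1, 1])
```

The resulting string can be passed as the `config` argument of `pytesseract.image_to_data`, and the polarity helper can be applied to the thresholded image before OCR.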

Solution 1
0 2023-01-30 11:41:15
