
Pytesseract OCR doesn't recognize the digits

I am trying to read these images:

[image 1] [image 2] [image 3]

I have tried several options, but I can't get them read correctly as 15/0, 30/0 and 40/0.

    import cv2
    import pytesseract
    from pytesseract import Output

    # `frame` is a BGR image loaded elsewhere; crop the region to be read.
    frame = frame[900:1000, 450:500]
    scale_percent = 200  # percent of original size
    width = int(frame.shape[1] * scale_percent / 100)
    height = int(frame.shape[0] * scale_percent / 100)
    dim = (width, height)
    frame = cv2.resize(frame, dim, interpolation=cv2.INTER_AREA)
    cv2.imshow("cropped", frame)
    cv2.waitKey(0)
    # pytesseract expects RGB channel order, OpenCV delivers BGR.
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    cv2.imshow("cropped", frame)
    cv2.waitKey(0)

    pytesseract.pytesseract.tesseract_cmd = (
        r"C:\Program Files\Tesseract-OCR\tesseract.exe"
    )
    # --psm 10 treats the image as a single character; digits only are allowed.
    results = pytesseract.image_to_data(
        frame,
        output_type=Output.DICT,
        config="--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789",
    )
    # results = replace_chars(results)
    print(("-").join(results["text"]), "\n")

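One solution is to threshold the image with inRange.

The result will be:

[image 1] [image 2] [image 3]

If you set the page segmentation mode to 6 (--psm 6, which assumes a single uniform block of text), the output is: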

15
0

30
0

40
0

Code:

    import cv2
    import pytesseract
    from numpy import array

    image_list = ["LZxCs.png", "W06I0.png", "vvzE5.png"]

    for image in image_list:
        bgr_image = cv2.imread(image)
        hsv_image = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
        # Keep pixels with hue <= 165 and saturation <= 10 (any brightness).
        mask = cv2.inRange(hsv_image, array([0, 0, 0]), array([165, 10, 255]))
        # Dilate the mask, then AND the result with the original mask.
        kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 3))
        dilate = cv2.dilate(mask, kernel, iterations=1)
        thresh = cv2.bitwise_and(dilate, mask)
        # --psm 6: assume a single uniform block of text.
        text = pytesseract.image_to_string(thresh, config='--psm 6')
        print(text)
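To see why those bounds isolate some pixels and reject others, here is a minimal pure-Python sketch of the per-pixel test that inRange performs. The helper names `bgr_to_opencv_hsv` and `in_range` and the sample colours are hypothetical, chosen only for illustration; the real colours in the images may differ. It assumes OpenCV's 8-bit HSV convention (H in 0..179, S and V in 0..255).

```python
import colorsys


def bgr_to_opencv_hsv(b, g, r):
    """Convert one 8-bit BGR pixel to OpenCV-style HSV (H 0..179, S and V 0..255)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 180) % 180, round(s * 255), round(v * 255)


def in_range(pixel, lower, upper):
    """Return 255 if every channel lies within the inclusive bounds, else 0."""
    inside = all(lo <= c <= hi for c, lo, hi in zip(pixel, lower, upper))
    return 255 if inside else 0


# Same bounds as in the code above: hue <= 165, saturation <= 10, any value.
LOWER, UPPER = (0, 0, 0), (165, 10, 255)

# Hypothetical BGR sample colours, not taken from the actual images.
samples = {
    "white": (255, 255, 255),
    "light gray": (200, 200, 200),
    "green": (0, 200, 0),
    "dark blue": (120, 30, 10),
}

mask = {
    name: in_range(bgr_to_opencv_hsv(*bgr), LOWER, UPPER)
    for name, bgr in samples.items()
}
print(mask)
```

The saturation bound of 10 is what does the work here: colourless pixels (white, gray) pass, while saturated colours are masked out regardless of brightness.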

The second approach is to apply a global threshold.

[image 1] [image 2] [image 3]

If you set the page segmentation mode to 6 (--psm 6), the output is:

15
0

30
0

40
0

Code:

    import cv2
    import pytesseract

    image_list = ["LZxCs.png", "W06I0.png", "vvzE5.png"]

    for image in image_list:
        bgr_image = cv2.imread(image)
        gray_image = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
        # With THRESH_OTSU the threshold is computed from the image,
        # so the 127 passed here is ignored.
        thresh = cv2.threshold(gray_image, 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
        # --psm 6: assume a single uniform block of text.
        text = pytesseract.image_to_string(thresh, config='--psm 6')
        print(text)
        # Save and display the thresholded image for inspection.
        cv2.imwrite(f"/Users/ahx/Desktop/{image}", thresh)
        cv2.imshow('', thresh)
        cv2.waitKey(0)
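For readers unfamiliar with the THRESH_OTSU flag, the following is a rough pure-Python sketch of the idea behind Otsu's method: pick the gray level that maximises the between-class variance of the two resulting pixel groups. The function name `otsu_threshold` and the pixel values are hypothetical, and this is not OpenCV's actual implementation, only an illustration of the principle.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0..255) that maximises between-class variance
    when pixels <= t form one class and pixels > t form the other."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the darker class so far
    sum0 = 0  # sum of gray levels in the darker class so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, the variance is undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance.
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Hypothetical bimodal data: a dark group near 30..40, a bright group near 200..210.
pixels = [30] * 40 + [40] * 40 + [200] * 60 + [210] * 60
t = otsu_threshold(pixels)

# Same mapping as THRESH_BINARY_INV: values above t become 0, the rest 255.
binary_inv = [255 if p <= t else 0 for p in pixels]
print(t, binary_inv.count(255))
```

Because the threshold is derived from the histogram, this works best when the image has two clearly separated brightness groups, as in the made-up data here.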

For more information, you can check the documentation.
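Both approaches print the two numbers on separate lines, whereas the question asks for the form 15/0. A small post-processing step can bridge that gap. This is only a sketch: `format_score` is a hypothetical helper, and the raw strings are stand-ins for what `image_to_string` might return, including the trailing form feed that Tesseract output often carries.

```python
import re


def format_score(ocr_text):
    """Join every run of digits found in the OCR output with a slash."""
    return "/".join(re.findall(r"\d+", ocr_text))


# Hypothetical raw OCR strings shaped like the output shown above.
for raw in ["15\n0\n", "30\n0\n", "40\n0\n\x0c"]:
    print(format_score(raw))
```

Extracting digit runs with a regular expression also discards stray whitespace, so no separate strip step is needed.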

