Pytesseract doesn't recognize decimal points
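I suggest passing each row of text to tesseract as a separate image.
For some reason, that seems to solve the decimal point issue...
The steps are:
Convert the image from grayscale to black and white using cv2.threshold.
Merge the text into row-wide blocks with the cv2.dilate morphological operation, using a very wide horizontal kernel.
Find the contours of the merged rows, take their bounding boxes, and sort them from top to bottom.
Iterate over the bounding boxes and pass each slice to pytesseract.
Here is the code: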
import numpy as np
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe' # I am using Windows
path_to_image = 'image.png'
img = cv2.imread(path_to_image, cv2.IMREAD_GRAYSCALE) # Read input image as Grayscale
# Convert to binary using automatic threshold (use cv2.THRESH_OTSU)
ret, thresh = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
# Dilate thresh for uniting text areas into blocks of rows.
dilated_thresh = cv2.dilate(thresh, np.ones((3,100)))
# Find contours on dilated_thresh
cnts = cv2.findContours(dilated_thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2] # Use index [-2] to be compatible to OpenCV 3 and 4
# Build a list of bounding boxes
bounding_boxes = [cv2.boundingRect(c) for c in cnts]
# Sort bounding boxes from "top to bottom"
bounding_boxes = sorted(bounding_boxes, key=lambda b: b[1])
# Iterate bounding boxes
for b in bounding_boxes:
    x, y, w, h = b
    if (h > 10) and (w > 10):
        # Crop a slice with a 10-pixel margin, and invert black and white (tesseract prefers black text).
        row_img = 255 - thresh[max(y-10, 0):min(y+h+10, thresh.shape[0]), max(x-10, 0):min(x+w+10, thresh.shape[1])]
        text = pytesseract.image_to_string(row_img, config="-c tessedit"
                                           "_char_whitelist=abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890-:."
                                           " --psm 3")
        print(text)
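I know it's not the most general solution, but it manages to solve the sample you posted.
Please treat this answer as a conceptual solution - finding a robust solution may be very challenging.
Result:
Output text: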
7.3-8.2
Primo:50
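As a lighter-weight variation on the row-splitting idea above, here is a minimal sketch that finds text rows with a horizontal projection profile instead of cv2.dilate and cv2.findContours. This is a swapped-in technique, not the code from this answer. The function name split_rows and its parameters pad and min_height are hypothetical, and it assumes white text on a black background (like thresh) laid out in a single, unskewed column.

```python
import numpy as np

def split_rows(binary, pad=10, min_height=10):
    """Return (top, bottom) row ranges that contain text pixels.

    `binary` is assumed to be a 2-D array with non-zero text pixels on a
    zero background. `pad` and `min_height` are illustrative defaults.
    """
    has_ink = binary.any(axis=1)  # True for every image row containing text
    # Pad with False on both ends so every run of True has a start and an end
    padded = np.concatenate(([False], has_ink, [False])).astype(np.int8)
    edges = np.flatnonzero(np.diff(padded))  # alternating run starts and (exclusive) ends
    rows = []
    for start, stop in zip(edges[0::2], edges[1::2]):
        if stop - start > min_height:  # skip thin specks of noise
            top = max(int(start) - pad, 0)
            bottom = min(int(stop) + pad, binary.shape[0])
            rows.append((top, bottom))
    return rows

# Synthetic example: two "text" bands on a black 100x200 image
demo = np.zeros((100, 200), dtype=np.uint8)
demo[20:40, 10:150] = 255
demo[60:80, 10:150] = 255
demo_rows = split_rows(demo, pad=5)
print(demo_rows)
```

Each (top, bottom) pair could then be used to slice thresh, and the inverted crop passed to pytesseract.image_to_string as in the loop above. Unlike the bounding-box approach, this sketch does not crop horizontally and will not separate rows that overlap vertically.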
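You can recognize the text easily by down-sampling the image.
If you down-sample by a factor of 0.5 and then read the result with pytesseract, the output is: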
7.3 - 8.2
Primo: 50
I got this result with pytesseract version 0.3.7 (current).
Code:
# Load the libraries
import cv2
import pytesseract
# Load the image
img = cv2.imread("s9edQ.png")
# Convert to the gray-scale
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Down-sample
gry = cv2.resize(gry, (0, 0), fx=0.5, fy=0.5)
# OCR
txt = pytesseract.image_to_string(gry)
print(txt)
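Explanation:
The input image contains some artifacts; you can see them on the right side of the image. The current (down-sampled) image, on the other hand, is well suited to OCR. When the data in an image is not visible or is corrupted, you need to apply pre-processing methods.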
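When comparing pre-processing variants, it can help to check automatically whether the decimal points survived OCR. The following is a minimal sketch, assuming the output contains a range written like the 7.3-8.2 or 7.3 - 8.2 shown above. The names RANGE_RE and parse_decimal_range are hypothetical helpers, not part of pytesseract.

```python
import re

# Matches a decimal range such as "7.3-8.2" or "7.3 - 8.2"
RANGE_RE = re.compile(r"(\d+\.\d+)\s*-\s*(\d+\.\d+)")

def parse_decimal_range(text):
    """Return (low, high) as floats, or None if no decimal range is found."""
    match = RANGE_RE.search(text)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))

# OCR output in which the decimal points were recognized
print(parse_decimal_range("7.3-8.2\nPrimo:50"))
# OCR output in which the decimal points were lost
print(parse_decimal_range("73-82\nPrimo:50"))
```

A None result for a given pre-processing setting suggests the dots were dropped, so a different scale factor or threshold may be worth trying.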