OpenCV OCR: improving data extraction from a color image with a background

Question

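I am trying to extract some information from a mobile phone screenshot. My code retrieves some of the information, but not all of it. I read the image, convert it to grayscale, remove the unwanted parts, and apply adaptive Gaussian thresholding. However, not all of the text is read.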

import numpy as np
import cv2
from PIL import Image
import matplotlib.pyplot as plt
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r'C:\\Installs\\Tools\\Tesseract-OCR\\tesseract.exe'

image = "C:\\Workspace\\OCR\\tesseract\\rpstocks1 - Copy (2).png"
img = cv2.imread(image)
img_grey = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)

height, width, channels = img.shape
print (height, width, channels)


rec_img=cv2.rectangle(img_grey,(30,100),(1040,704),(0,255,0),3).copy()

crop_img = rec_img[105:1945, 35:1035].copy()
cv2.medianBlur(img,5)
cv2.imwrite("C:\\Workspace\\OCR\\tesseract\\Cropped_GREY.jpg",crop_img)

img_gauss = cv2.adaptiveThreshold(crop_img,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY,11,12)
cv2.imwrite("C:\\Workspace\\OCR\\tesseract\\Cropped_Guass.jpg",img_gauss)
text = pytesseract.image_to_string(img_gauss, lang='eng')
text.encode('utf-8')
print(text)

Output

Image dimensions: 704 1080 3

Investing

$9,712.99 
ASRT _ 0
500.46 shares  ......... ..  /0 
GNUS 
25169 Shares  """"" " ‘27.98%

rpstocks1 - Copy (2).png Cropped_GREY.jpg Cropped_Guass.jpg

Answer 1

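Have a look at the page segmentation modes of pytesseract, cf. this Q&A. For example, using config='-psm 12' already gives you all the desired text. Nevertheless, the graphs are also somehow interpreted as text.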
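To compare page segmentation modes quickly, a minimal sketch along these lines might help. The helper names psm_config and ocr_with_psm are made up for illustration, and the flag spelling is an assumption about your Tesseract build: as far as I know, 3.x builds take '-psm' while 4.x and later releases take '--psm'.

```python
def psm_config(mode, legacy=False):
    """Build the Tesseract config string for a page segmentation mode.

    Assumption: older (3.x) builds spell the flag '-psm',
    4.x and later builds spell it '--psm'.
    """
    flag = '-psm' if legacy else '--psm'
    return f'{flag} {mode}'


def ocr_with_psm(image, mode, legacy=False):
    """Run pytesseract on an image (NumPy array or PIL image) with one mode."""
    import pytesseract  # imported lazily so psm_config works without it
    return pytesseract.image_to_string(image, config=psm_config(mode, legacy))
```

Calling ocr_with_psm(img_grey, 12) and ocr_with_psm(img_grey, 6) on the same grayscale image lets you see directly how sparse-text mode (12) and single-uniform-block mode (6) differ on your screenshot.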

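Because the graphs get picked up as text, I would preprocess the image to get single boxes (actual texts, the graphs, the information at the top, etc.), and filter them to keep only the boxes with content of interest. That can be done using

the y coordinate of the bounding rectangle (not in the upper 5 % of the image, which is the mobile phone status bar),
the width w of the bounding rectangle (not wider than 50 % of the image's width, since those are the horizontal lines),
the x coordinate of the bounding rectangle (not in the middle third of the image, since those are the graphs).

What's left is to run pytesseract on each cropped image, for example with config='-psm 6' (assume a single uniform block of text), and to strip any line breaks from the resulting texts.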

That would be my code:

import cv2
import pytesseract

# Read image
img = cv2.imread('cUcby.png')
hi, wi = img.shape[:2]

# Convert to grayscale for tesseract
img_grey = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Mask single boxes by thresholding and morphological closing in x direction
mask = cv2.threshold(img_grey, 248, 255, cv2.THRESH_BINARY_INV)[1]
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE,
                        cv2.getStructuringElement(cv2.MORPH_RECT, (51, 1)))

# Find contours w.r.t. the OpenCV version
cnts = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

# Get bounding rectangles
rects = [cv2.boundingRect(cnt) for cnt in cnts]

# Filter bounding rectangles:
# - not in the upper 5 % of the image (mobile phone status bar)
# - not wider than 50 % of the image's width (horizontal lines)
# - not being in the middle third of the image (graphs)
rects = [(x, y, w, h) for x, y, w, h in rects if
         (y > 0.05 * hi) and
         (w <= 0.5 * wi) and
         ((x < 0.3333 * wi) or (x > 0.6666 * wi))]

# Sort bounding rectangles first by y coordinate, then by x coordinate
rects = sorted(rects, key=lambda x: (x[1], x[0]))

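# Get texts from bounding rectangles from pytesseract; the 1 px margin is
# clamped at 0 so a box touching the image border is not sliced away
texts = [pytesseract.image_to_string(
    img_grey[max(y-1, 0):y+h+1, max(x-1, 0):x+w+1], config='-psm 6')
    for x, y, w, h in rects]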

# Remove line breaks
texts = [text.replace('\n', '') for text in texts]

# Output
print(texts)

And this is the output:

['Investing', '$9,712.99', 'ASRT', '-27.64%', '500.46 shares', 'GNUS', '-27.98%', '251.69 shares']

Since you have the locations of the bounding rectangles, you could also rearrange the whole text using that information.
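One way to do that rearranging, as a rough sketch rather than a tested solution: group boxes whose vertical centers are close together into one row, then order each row left to right. The function name group_into_rows, the y_tolerance value, and the coordinates in the example are all made up for illustration; only the texts come from the output above.

```python
def group_into_rows(rects, texts, y_tolerance=10):
    """Group OCR results into rows of text based on bounding-box position.

    rects: list of (x, y, w, h) tuples as returned by cv2.boundingRect
    texts: recognized strings, in the same order as rects
    y_tolerance: hypothetical threshold in pixels; a box joins the current
                 row if its vertical center is within this distance of the
                 center of the row's first box
    """
    # Sort by vertical center so boxes of the same row become neighbors
    items = sorted(zip(rects, texts), key=lambda it: it[0][1] + it[0][3] / 2)

    rows = []
    for rect, text in items:
        center = rect[1] + rect[3] / 2
        if rows and abs(center - rows[-1]['center']) <= y_tolerance:
            rows[-1]['items'].append((rect, text))
        else:
            rows.append({'center': center, 'items': [(rect, text)]})

    # Within each row, order boxes left to right and join their texts
    return [' '.join(text for _, text in
                     sorted(row['items'], key=lambda it: it[0][0]))
            for row in rows]


# Made-up coordinates, real texts from the output above
rects = [(40, 120, 200, 40), (40, 180, 260, 50),
         (40, 300, 90, 30), (900, 300, 120, 30), (40, 335, 180, 25),
         (40, 420, 90, 30), (900, 420, 120, 30), (40, 455, 180, 25)]
texts = ['Investing', '$9,712.99', 'ASRT', '-27.64%', '500.46 shares',
         'GNUS', '-27.98%', '251.69 shares']

print(group_into_rows(rects, texts))
# ['Investing', '$9,712.99', 'ASRT -27.64%', '500.46 shares',
#  'GNUS -27.98%', '251.69 shares']
```

With the real rectangles from the code above, the tolerance would need tuning to the actual line spacing of the screenshot.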

----------------------------------------
System information
----------------------------------------
Platform:      Windows-10-10.0.16299-SP0
Python:        3.9.1
PyCharm:       2021.1.1
OpenCV:        4.5.1
pytesseract:   4.00.00alpha
----------------------------------------
