
How to detect if text is rotated 180 degrees or flipped upside down

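I am working on a text recognition project. The text may be rotated 180 degrees. I tried tesseract-ocr in the terminal, but with no luck. Is there a way to detect the rotation and correct it? A sample of the text is shown below.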

[Image: sample of the text]

tesseract input.png output

tesseract input.png - --psm 0 -c min_characters_to_try=10

Warning. Invalid resolution 0 dpi. Using 70 instead.
Page number: 0
Orientation in degrees: 180
Rotate: 180
Orientation confidence: 0.74
Script: Latin
Script confidence: 1.67

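One simple way to detect whether text is rotated 180 degrees is to use the observation that text tends to skew toward the bottom. Here is the strategy:

  • Convert the image to grayscale
  • Apply a Gaussian blur
  • Threshold the image
  • Find the top-half and bottom-half ROIs of the thresholded image
  • Count the non-zero array elements in each half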

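Thresholded image

[image]

Find the ROIs of the top and bottom halves

[image]

[image]

Next we split the image into its top and bottom sections

[image]

For each half, we count the non-zero array elements using cv2.countNonZero(). We get this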

('top', 4035)
('bottom', 3389)

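We then compare the two values. If the top half has more non-zero pixels than the bottom half, the image is upside down (rotated 180 degrees). If it has fewer, the image is correctly oriented.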
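The comparison above can be sketched without OpenCV. This is a minimal illustration rather than the answer's own code: it assumes a binarized image stored as nested lists in which text pixels are non-zero, and the helper names `count_nonzero` and `looks_upside_down` are made up for this example.

```python
def count_nonzero(rows):
    """Count the non-zero entries in a list of pixel rows."""
    return sum(1 for row in rows for value in row if value != 0)


def looks_upside_down(binary_image):
    """Return (flag, top, bottom) for a binarized image given as nested lists.

    flag is True when the top half holds more non-zero pixels than the
    bottom half, which is the heuristic described above.
    """
    middle = len(binary_image) // 2  # integer row index of the split
    top = count_nonzero(binary_image[:middle])
    bottom = count_nonzero(binary_image[middle:])
    return top > bottom, top, bottom


# Hypothetical 4x4 "image": most of the ink sits in the top two rows.
sample = [
    [255, 255, 255, 0],
    [255, 255, 0, 0],
    [0, 255, 0, 0],
    [0, 0, 0, 0],
]
print(looks_upside_down(sample))
```

A tie is treated as correctly oriented here because the comparison is a strict `>`. When the two counts are close, the heuristic may be unreliable, so requiring a minimum margin between them could be worth trying; that margin is a suggestion and not part of the answer.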

Now that we have detected whether the image is upside down, we can rotate it with this function

def rotate(image, angle):
    # Obtain the dimensions of the image
    (height, width) = image.shape[:2]
    (cX, cY) = (width / 2, height / 2)

    # Grab the rotation components of the matrix
    matrix = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
    cos = np.abs(matrix[0, 0])
    sin = np.abs(matrix[0, 1])

    # Find the new bounding dimensions of the image
    new_width = int((height * sin) + (width * cos))
    new_height = int((height * cos) + (width * sin))

    # Adjust the rotation matrix to take into account translation
    matrix[0, 2] += (new_width / 2) - cX
    matrix[1, 2] += (new_height / 2) - cY

    # Perform the actual rotation and return the image
    return cv2.warpAffine(image, matrix, (new_width, new_height))
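To make the bounding-box arithmetic in the function concrete, here is a small sketch that reproduces only the size calculation with the standard `math` module. It assumes the same formulas as the function above; `rotated_bounds` is a hypothetical helper and the 640x480 size is an arbitrary example.

```python
import math


def rotated_bounds(width, height, angle_degrees):
    """Size of the bounding box that contains the image after rotation."""
    radians = math.radians(angle_degrees)
    # Absolute values make the result independent of the rotation direction
    cos = abs(math.cos(radians))
    sin = abs(math.sin(radians))
    new_width = int(height * sin + width * cos)
    new_height = int(height * cos + width * sin)
    return new_width, new_height


print(rotated_bounds(640, 480, 180))  # same size as the input
print(rotated_bounds(640, 480, 90))   # width and height swap
```

For a 180 degree rotation the output canvas keeps the input size, so the translation adjustment in the function only matters for other angles.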

Rotate the image

rotated = rotate(original_image, 180)
cv2.imshow("rotated", rotated)

This gives us the correct result

[image]
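For the specific case of 180 degrees, the rotation amounts to reversing the row order and the pixel order within each row, so no interpolation is needed. The sketch below shows the idea on a nested list; `rotate_180` is a hypothetical helper written for illustration. With OpenCV, `cv2.rotate(image, cv2.ROTATE_180)` should give the same kind of result on an image array.

```python
def rotate_180(image):
    """Rotate a 2-D grid by 180 degrees by reversing rows and columns."""
    return [row[::-1] for row in image[::-1]]


grid = [
    [1, 2, 3],
    [4, 5, 6],
]
print(rotate_180(grid))
```

Applying the function twice returns the original grid, which is a quick sanity check for any 180 degree rotation.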

If the image is correctly oriented, these are the pixel counts

('top', 3209)
('bottom', 4206)

Full code

import numpy as np
import cv2

def rotate(image, angle):
    # Obtain the dimensions of the image
    (height, width) = image.shape[:2]
    (cX, cY) = (width / 2, height / 2)

    # Grab the rotation components of the matrix
    matrix = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
    cos = np.abs(matrix[0, 0])
    sin = np.abs(matrix[0, 1])

    # Find the new bounding dimensions of the image
    new_width = int((height * sin) + (width * cos))
    new_height = int((height * cos) + (width * sin))

    # Adjust the rotation matrix to take into account translation
    matrix[0, 2] += (new_width / 2) - cX
    matrix[1, 2] += (new_height / 2) - cY

    # Perform the actual rotation and return the image
    return cv2.warpAffine(image, matrix, (new_width, new_height))

image = cv2.imread("1.PNG")
original_image = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blurred, 110, 255, cv2.THRESH_BINARY_INV)[1]
cv2.imshow("thresh", thresh)

x, y, w, h = 0, 0, image.shape[1], image.shape[0]

# Use integer division so the coordinates are valid slice indices in Python 3
top_half = ((x,y), (x+w, y+h//2))
bottom_half = ((x,y+h//2), (x+w, y+h))

top_x1,top_y1 = top_half[0]
top_x2,top_y2 = top_half[1]
bottom_x1,bottom_y1 = bottom_half[0]
bottom_x2,bottom_y2 = bottom_half[1]

# Split into top/bottom ROIs
top_image = thresh[top_y1:top_y2, top_x1:top_x2]
bottom_image = thresh[bottom_y1:bottom_y2, bottom_x1:bottom_x2]

cv2.imshow("top_image", top_image)
cv2.imshow("bottom_image", bottom_image)

# Count non-zero array elements
top_pixels = cv2.countNonZero(top_image)
bottom_pixels = cv2.countNonZero(bottom_image)

print('top', top_pixels)
print('bottom', bottom_pixels)

# Rotate if upside down
if top_pixels > bottom_pixels:
    rotated = rotate(original_image, 180)
    cv2.imshow("rotated", rotated)

cv2.waitKey(0)

I somewhat prefer the pytesseract solution.

import cv2
import pytesseract
from scipy.ndimage import rotate as Rotate

def float_convertor(x):
    # Convert numeric fields (including decimals such as "0.74") to float
    try:
        return float(x)
    except ValueError:
        return x

def tesseract_find_rotation(img):
    # Accept either a file path or an already loaded image array
    img = cv2.imread(img) if isinstance(img, str) else img
    k = pytesseract.image_to_osd(img)
    out = {i.split(":")[0]: float_convertor(i.split(":")[-1].strip()) for i in k.rstrip().split("\n")}
    img_rotated = Rotate(img, 360-out["Rotate"])
    return img_rotated, out

Usage

img_loc = ""
img_rotated, out = tesseract_find_rotation(img_loc)

