
How to detect if text is rotated 180 degrees or flipped upside down

I am working on a text recognition project. There is a chance the text is rotated 180 degrees. I have tried tesseract-ocr from the terminal, but no luck. Is there any way to detect and correct it? An example of the text is shown below.

[image: example of the upside-down text]

tesseract input.png output

tesseract input.png - --psm 0 -c min_characters_to_try=10

Warning. Invalid resolution 0 dpi. Using 70 instead.
Page number: 0
Orientation in degrees: 180
Rotate: 180
Orientation confidence: 0.74
Script: Latin
Script confidence: 1.67

One simple approach to detect whether text is rotated 180 degrees is to use the observation that correctly oriented text tends to have more of its ink in the bottom half of the image than in the top half. Here's the strategy:

  • Convert image to grayscale
  • Gaussian blur
  • Threshold image
  • Find the top/bottom half ROIs of thresholded image
  • Count non-zero array elements for each half

Thresholded image

[image: thresholded result]

Find ROIs of top and bottom half

[images: top half ROI and bottom half ROI highlighted]

Next we split the top/bottom sections

[image: the image split into top and bottom halves]

For each half, we count the non-zero array elements using cv2.countNonZero(). We get this:

top 4035
bottom 3389

Comparing the two halves: if the top half has more non-zero pixels than the bottom half, the image is upside down (rotated 180 degrees); if it has fewer, it is correctly oriented.
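The comparison can be wrapped into a small helper. This is a minimal sketch; the name is_upside_down is my own and not part of the original answer:

import cv2

def is_upside_down(thresh):
    # Split the thresholded image into top and bottom halves
    height = thresh.shape[0]
    top = thresh[:height//2, :]
    bottom = thresh[height//2:, :]
    # More foreground (non-zero) pixels in the top half suggests the text is flipped
    return cv2.countNonZero(top) > cv2.countNonZero(bottom)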

Now that we can detect whether the image is upside down, we can rotate it using this function

def rotate(image, angle):
    # Obtain the dimensions of the image
    (height, width) = image.shape[:2]
    (cX, cY) = (width / 2, height / 2)

    # Grab the rotation components of the matrix
    matrix = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
    cos = np.abs(matrix[0, 0])
    sin = np.abs(matrix[0, 1])

    # Find the new bounding dimensions of the image
    new_width = int((height * sin) + (width * cos))
    new_height = int((height * cos) + (width * sin))

    # Adjust the rotation matrix to take into account translation
    matrix[0, 2] += (new_width / 2) - cX
    matrix[1, 2] += (new_height / 2) - cY

    # Perform the actual rotation and return the image
    return cv2.warpAffine(image, matrix, (new_width, new_height))

Rotating the image

rotated = rotate(original_image, 180)
cv2.imshow("rotated", rotated)

which gives us the correct result

[image: corrected, upright result]
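As an aside, since the correction here is exactly 180 degrees, OpenCV's built-in cv2.rotate would also work and avoids the bounding-box math. This is an alternative I'm adding, not part of the original answer:

rotated = cv2.rotate(original_image, cv2.ROTATE_180)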

For reference, these are the pixel counts when the image is correctly oriented

top 3209
bottom 4206

Full code

import numpy as np
import cv2

def rotate(image, angle):
    # Obtain the dimensions of the image
    (height, width) = image.shape[:2]
    (cX, cY) = (width / 2, height / 2)

    # Grab the rotation components of the matrix
    matrix = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
    cos = np.abs(matrix[0, 0])
    sin = np.abs(matrix[0, 1])

    # Find the new bounding dimensions of the image
    new_width = int((height * sin) + (width * cos))
    new_height = int((height * cos) + (width * sin))

    # Adjust the rotation matrix to take into account translation
    matrix[0, 2] += (new_width / 2) - cX
    matrix[1, 2] += (new_height / 2) - cY

    # Perform the actual rotation and return the image
    return cv2.warpAffine(image, matrix, (new_width, new_height))

image = cv2.imread("1.PNG")
original_image = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blurred, 110, 255, cv2.THRESH_BINARY_INV)[1]
cv2.imshow("thresh", thresh)

x, y, w, h = 0, 0, image.shape[1], image.shape[0]

top_half = ((x, y), (x + w, y + h//2))
bottom_half = ((x, y + h//2), (x + w, y + h))

top_x1,top_y1 = top_half[0]
top_x2,top_y2 = top_half[1]
bottom_x1,bottom_y1 = bottom_half[0]
bottom_x2,bottom_y2 = bottom_half[1]

# Split into top/bottom ROIs
top_image = thresh[top_y1:top_y2, top_x1:top_x2]
bottom_image = thresh[bottom_y1:bottom_y2, bottom_x1:bottom_x2]

cv2.imshow("top_image", top_image)
cv2.imshow("bottom_image", bottom_image)

# Count non-zero array elements
top_pixels = cv2.countNonZero(top_image)
bottom_pixels = cv2.countNonZero(bottom_image)

print('top', top_pixels)
print('bottom', bottom_pixels)

# Rotate if upside down
if top_pixels > bottom_pixels:
    rotated = rotate(original_image, 180)
    cv2.imshow("rotated", rotated)

cv2.waitKey(0)
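A possible refinement, not part of the original answer: the fixed threshold of 110 depends on how bright the scan is. Otsu's method chooses the threshold automatically and can be swapped in on the same line:

thresh = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]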

I kind of liked the pytesseract solution.

import cv2
import pytesseract
from scipy.ndimage import rotate as Rotate

def float_convertor(x):
    # Convert numeric strings (e.g. "0.74" or "180") to float, leave everything else as-is
    try:
        return float(x)
    except ValueError:
        return x

def tesseract_find_rotation(img):
    # Accept either a file path or an already-loaded image array
    img = cv2.imread(img) if isinstance(img, str) else img
    # Run Tesseract's orientation and script detection (OSD)
    k = pytesseract.image_to_osd(img)
    # Parse the "key: value" lines of the OSD output into a dict
    out = {i.split(":")[0]: float_convertor(i.split(":")[-1].strip()) for i in k.rstrip().split("\n")}
    # Rotate the image by (360 - angle) degrees to undo the rotation reported
    # in the "Rotate" field of the OSD output
    img_rotated = Rotate(img, 360 - out["Rotate"])
    return img_rotated, out

usage

img_loc = ""
img_rotated, out = tesseract_find_rotation(img_loc)
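One caveat worth adding (my own note, not from the answer): the OSD output includes an "Orientation confidence" field, so it may be sensible to only trust the rotation when that value is high enough. A sketch continuing from the usage above, with an arbitrary threshold of 1.0:

if out["Orientation confidence"] < 1.0:
    # Low confidence: fall back to the original, un-rotated image
    img_rotated = cv2.imread(img_loc)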
