
Draw lines above and below each line of text in a landscape image, or boxes around text in an image, without losing resolution, in Python using OpenCV

Hi, I have been trying to develop a tool that runs my menu cards through OCR to digitize them. Menu cards come in various layouts: a portrait-oriented page may be divided into two columns, and a landscape-oriented page may be divided into multiple columns of menu items. I have managed to gather snippets from around here that process portrait-oriented menu pages, but that code fails on landscape orientation. If I remove the if condition that rotates the image, then instead of giving me a result where each line of text in the menu card sits between two lines, it only removes noise from the image and nothing more. Let me explain my problem with a few examples. Please guide me through processing menus in image form so that I can put them through OCR for digitization. I am using pytesseract for OCR and OpenCV for image processing.

This is what I am using to draw underlines and overlines on the text inside an image:

import cv2
import numpy as np


## (1) read
img = cv2.imread("out-1.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

## (2) threshold
th, threshed = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV|cv2.THRESH_OTSU)

## (3) minAreaRect on the nonzero pixels
pts = cv2.findNonZero(threshed)
ret = cv2.minAreaRect(pts)

(cx,cy), (w,h), ang = ret
if w>h:
    w,h = h,w
    ang += 90

## (4) Find rotated matrix, do rotation
M = cv2.getRotationMatrix2D((cx,cy), ang, 1.0)
rotated = cv2.warpAffine(threshed, M, (img.shape[1], img.shape[0]))

## (5) find and draw the upper and lower boundary of each lines
hist = cv2.reduce(rotated,1, cv2.REDUCE_AVG).reshape(-1)

th = 2
H,W = img.shape[:2]
uppers = [y for y in range(H-1) if hist[y]<=th and hist[y+1]>th]
lowers = [y for y in range(H-1) if hist[y]>th and hist[y+1]<=th]

rotated = cv2.cvtColor(rotated, cv2.COLOR_GRAY2BGR)
for y in uppers:
    cv2.line(rotated, (0,y), (W, y), (255,0,0), 1)

for y in lowers:
    cv2.line(rotated, (0,y), (W, y), (0,255,0), 1)

cv2.imwrite("processed1.png", rotated)
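To make step (5) of the script above more concrete, here is a minimal sketch of the projection-profile idea on a small synthetic binary image. It uses NumPy's `mean` in place of `cv2.reduce`, so the profile is float rather than uint8. The helper name `find_bounds` and its `axis` parameter are hypothetical and not part of the original script. With `axis=0` the same logic reports column boundaries, which might be one way to locate the gaps between menu columns on a landscape page; that is an assumption and has not been tried on real menus.

```python
import numpy as np


def find_bounds(binary, th=2, axis=1):
    """Return (starts, ends) where the ink profile crosses the threshold.

    axis=1 averages each row (text-line boundaries);
    axis=0 averages each column (column boundaries).
    Hypothetical helper, mirroring the uppers/lowers logic of the script.
    """
    profile = binary.mean(axis=axis)
    inked = profile > th
    n = len(profile)
    starts = [i for i in range(n - 1) if not inked[i] and inked[i + 1]]
    ends = [i for i in range(n - 1) if inked[i] and not inked[i + 1]]
    return starts, ends


# Synthetic page: white "text" (255) on black, like the thresholded image.
page = np.zeros((20, 30), np.uint8)
page[3:7, 2:12] = 255      # first text block
page[12:16, 18:28] = 255   # second text block, further right

uppers, lowers = find_bounds(page, axis=1)
lefts, rights = find_bounds(page, axis=0)
print(uppers, lowers)  # [2, 11] [6, 15]
print(lefts, rights)   # [1, 17] [11, 27]
```

Each start index is the last blank row (or column) before the ink begins, and each end index is the last inked one, which matches how `uppers` and `lowers` are computed in the script.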

This is what I am using to draw boxes around text in an image. This code runs fine but needs improvement: it reduces the resolution of the image while drawing the boxes, and the box outlines are so thick that the OCR sometimes misreads the text.

import cv2
import numpy as np

large = cv2.imread('out-1.jpg')
rgb = cv2.pyrDown(large)
small = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)

# kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
kernel = np.ones((5, 5), np.uint8)
grad = cv2.morphologyEx(small, cv2.MORPH_GRADIENT, kernel)

_, bw = cv2.threshold(grad, 0.0, 255.0, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 1))
connected = cv2.morphologyEx(bw, cv2.MORPH_CLOSE, kernel)

# RETR_EXTERNAL (instead of RETR_CCOMP) keeps only the outermost contours.
# findContours returns (image, contours, hierarchy) in OpenCV 3.x and
# (contours, hierarchy) in OpenCV 2.x and 4.x; taking the last two values works in all of them.
contours, hierarchy = cv2.findContours(connected.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2:]

mask = np.zeros(bw.shape, dtype=np.uint8)

for idx in range(len(contours)):
    x, y, w, h = cv2.boundingRect(contours[idx])
    mask[y:y+h, x:x+w] = 0
    cv2.drawContours(mask, contours, idx, (255, 255, 255), -1)
    r = float(cv2.countNonZero(mask[y:y+h, x:x+w])) / (w * h)

    if r > 0.45 and w > 8 and h > 8:
        cv2.rectangle(rgb, (x, y), (x+w-1, y+h-1), (0, 255, 0), 2)

cv2.imwrite('rec_output.jpg', rgb)

This is how the lines should look for a landscape image as well, but the code does not produce them.

This is a landscape image example. It should get lines like those in the image above, clarity should not be compromised, and the OCR should read the partitions inside the menu correctly.

When I add boxes over the text for better readability by OCR, the second snippet compromises the resolution, which results in rather poor readability. If I don't add boxes, the menu is read horizontally across the columns, which mixes up menu items and prices.

cv2.pyrDown(large)

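is what reduces the resolution: it downsamples the image to half its width and height. I'm not sure why it is being used. Just removing that single line of code (and using large wherever rgb appears afterwards) gave me the same output without compromising on quality.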


 