I want to identify and highlight / crop the text between two lines using Python (cv2).
One line is a wavy line at the top, and the second line somewhere in the page. This line can appear at any height on the page, ranging from just after 1 line to just before the last line.
An example,
I believe I need to use HoughLinesP()
somehow with proper parameters for this. I've tried some examples involving a combination of erode
+ dilate
+ HoughLinesP
.
eg
img = cv2.imread(image)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
kernel_size = 5
blur_gray = cv2.GaussianBlur(gray, (kernel_size, kernel_size), 0)
# erode / dilate
erode_kernel_param = (5, 200) # (5, 50)
dilate_kernel_param = (5, 5) # (5, 75)
img_erode = cv2.erode(blur_gray, np.ones(erode_kernel_param))
img_dilate = cv2.dilate(img_erode, np.ones(dilate_kernel_param))
# %% Second, process edge detection use Canny.
low_threshold = 50
high_threshold = 150
edges = cv2.Canny(img_dilate, low_threshold, high_threshold)
# %% Then, use HoughLinesP to get the lines.
# Adjust the parameters for better performance.
rho = 1 # distance resolution in pixels of the Hough grid
theta = np.pi / 180 # angular resolution in radians of the Hough grid
threshold = 15 # min number of votes (intersections in Hough grid cell)
min_line_length = 600 # min number of pixels making up a line
max_line_gap = 20 # max gap in pixels between connectable line segments
line_image = np.copy(img) * 0 # creating a blank to draw lines on
# %% Run Hough on edge detected image
# Output "lines" is an array containing endpoints of detected line segments
lines = cv2.HoughLinesP(edges, rho, theta, threshold, np.array([]),
min_line_length, max_line_gap)
if lines is not None:
for line in lines:
for x1, y1, x2, y2 in line:
cv2.line(line_image, (x1, y1), (x2, y2), (255, 0, 0), 5)
# %% Draw the lines on the image
lines_edges = cv2.addWeighted(img, 0.8, line_image, 1, 0)
However, in many cases the lines dont get identified propery. Some examples of errors being,
Am I on the right track? Do I just need to hit the correct combination of parameters for this purpose? or is there a simpler way / trick which will let me reliably crop the text between these two lines?
In case it's relevant, I need to do this for ~450 pages. Here's the link to the book, in case someone wants to examine more examples of pages. https://archive.org/details/in.ernet.dli.2015.553713/page/n13/mode/2up
Thank you.
Solution
I've made minor modifications to the answer by Ari (Thank you), and made the code a bit more comprehensible for my own sake, here's my code.
The core idea is,
for image in images:
base_img = cv2.imread(image)
height, width, channels = base_img.shape
img = cv2.cvtColor(base_img, cv2.COLOR_BGR2GRAY)
ret, img = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
img = cv2.bitwise_not(img)
contours, hierarchy = cv2.findContours(
img, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE
)
# Get rectangle bounding contour
rects = [cv2.boundingRect(contour) for contour in contours]
# Rectangle is (x, y, w, h)
# Top-Left point of the image is (0, 0), rightwards X, downwards Y
# Sort the contours bigger width first
rects.sort(key=lambda r: r[2], reverse=True)
# Get the 2 "widest" rectangles
line_rects = rects[:2]
line_rects.sort(key=lambda r: r[1])
# If at least two rectangles (contours) were found
if len(line_rects) >= 2:
top_x, top_y, top_w, top_h = line_rects[0]
bot_x, bot_y, bot_w, bot_h = line_rects[1]
# Cropping the img
# Crop between bottom y of the upper rectangle (i.e. top_y + top_h)
# and the top y of lower rectangle (i.e. bot_y)
crop_img = base_img[top_y+top_h:bot_y]
# Highlight the area by drawing the rectangle
# For full width, 0 and width can be used, while
# For exact width (erroneous) top_x and bot_x + bot_w can be used
rect_img = cv2.rectangle(
base_img,
pt1=(0, top_y + top_h),
pt2=(width, bot_y),
color=(0, 255, 0),
thickness=2
)
cv2.imwrite(image.replace('.jpg', '.rect.jpg'), rect_img)
cv2.imwrite(image.replace('.jpg', '.crop.jpg'), crop_img)
else:
print(f"Insufficient contours in {image}")
You can find the Contours, and then take the two with the biggest width.
base_img = cv2.imread('a.png')
img = cv2.cvtColor(base_img, cv2.COLOR_BGR2GRAY)
ret, img = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
img = cv2.bitwise_not(img)
cnts, hierarchy = cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# sort the cnts bigger width first
cnts.sort(key=lambda c: cv2.boundingRect(c)[2], reverse=True)
# get the 2 big lines
lines = [cv2.boundingRect(cnts[0]), cv2.boundingRect(cnts[1])]
# higher line first
lines.sort(key=lambda c: c[1])
# croping the img
crop_img = base_img[lines[0][1]:lines[1][1]]
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.