
How to detect and remove borders from framed forms in opencv python?

Data Description

  • Dataset contains forms taken as pictures (so data quality varies greatly)
  • All forms follow the same template

Here's an example of the form element with borders:

[Example image: a form element surrounded by a rectangular border]

Aim

Detect and remove (approximately) rectangular borders or frames around the image.

Challenges

Due to shading effects and the like, the borders may not be of uniform colour, and they may include or be partially interrupted by symbols. Not all images have borders in the first place (in which case nothing needs removing).

References

The issue has been described previously by others in this link, and answers have been provided in C++. As I am not fluent in that language, I need to do it in Python.

The referenced answer described the following steps (and since I am just beginning in Computer Vision, I am unsure of what they mean); my rough attempt at them follows the list:

  1. compute Laplacian of image
  2. compute horizontal & vertical projection
  3. evaluation of changes in both directions
  4. find the maximum peak near the sides of the gradient image.
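
Here is my rough guess at what these steps look like in Python, so you can see where my understanding stands (the file name and the 10% side margin are just my assumptions, not part of the referenced answer):

import cv2
import numpy as np

gray = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)   # placeholder file name
lap = np.abs(cv2.Laplacian(gray, cv2.CV_64F))           # step 1: Laplacian of the image

col_proj = lap.sum(axis=0)   # step 2: vertical projection (one value per column)
row_proj = lap.sum(axis=1)   # step 2: horizontal projection (one value per row)

h, w = gray.shape
mw, mh = w // 10, h // 10    # steps 3-4: only look for peaks in the outer 10% of each side

left   = int(col_proj[:mw].argmax())
right  = int(w - mw + col_proj[-mw:].argmax())
top    = int(row_proj[:mh].argmax())
bottom = int(h - mh + row_proj[-mh:].argmax())
print(left, right, top, bottom)   # candidate border positions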

1 - You will have to make some assumption about the borders IF they are present - say they should be no more than 20 pixels wide, or roughly 10% of the height/width of the image. Looking at your data, you will be able to make this assumption.

Now we will isolate this 20-pixel border area from the image and work only within it.
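
A minimal sketch of that isolation, assuming `gray` is the grayscale image produced in step 2 below (the 20-pixel figure is the assumption made above):

# work only in the outer band of the image, where a border could live
b = 20  # assumed maximum border thickness (or ~10% of height/width)
top_strip    = gray[:b, :]
bottom_strip = gray[-b:, :]
left_strip   = gray[:, :b]
right_strip  = gray[:, -b:]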

2 - Convert the image to grayscale, since the colour of your border varies. Working in grayscale will make life easier. If you can threshold it, that is even better.

import cv2
import numpy as np

# read in colour, then convert to grayscale
img = cv2.imread('input.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
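
As a hedged example of the optional thresholding mentioned above (my addition, not part of the original answer), Otsu's method picks the cut-off automatically, which helps when lighting varies between photos:

# the 0 threshold value is ignored when THRESH_OTSU is set;
# Otsu computes the threshold from the image histogram instead
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)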

3 - Since your image borders can be partially interrupted by symbols, use a dilation operation. If there is a uniform border, or no border at all, nothing will happen. If a border is present but interrupted, the dilation will make it uniform.

Taking a matrix of size 5 as the kernel

kernel = np.ones((5,5), np.uint8)    
img_dilated = cv2.dilate(gray, kernel, iterations=1)

You will need to experiment with

  • kernel size
  • number of iterations
  • Whether an erosion operation is required after dilation (erosion is the opposite of dilation) - see the sketch after this list
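
For that last bullet, a small sketch of what erosion after dilation could look like (my addition, reusing `kernel` and `img_dilated` from above):

# eroding after dilating shrinks features back toward their original size
# while keeping previously broken border segments joined up
img_closed = cv2.erode(img_dilated, kernel, iterations=1)

# the same dilate-then-erode pair in a single call (morphological "closing"):
# img_closed = cv2.morphologyEx(gray, cv2.MORPH_CLOSE, kernel)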

4 - Now let us find out whether there is any border at all, using the Laplacian. The Laplacian is a 2-D isotropic measure of the 2nd spatial derivative of an image. The Laplacian of an image highlights regions of rapid intensity change and is therefore often used for edge detection.

laplacian = cv2.Laplacian(img_dilated, cv2.CV_64F)

In the Laplacian of your image, you'll see two lines in place of your border. Note - you do not need separate horizontal and vertical Sobel operators; the Laplacian takes care of both directions. The Laplacian is a 2nd-order derivative, while Sobel is 1st-order.
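
For comparison only (you do not need these if you use the Laplacian), the separate Sobel derivatives would be:

# first-order derivatives in x and y - the single Laplacian call above covers both
sobel_x = cv2.Sobel(img_dilated, cv2.CV_64F, 1, 0, ksize=3)
sobel_y = cv2.Sobel(img_dilated, cv2.CV_64F, 0, 1, ksize=3)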

5 - Now you want the algorithm to detect whether there is any double line. For that we use the Hough transform.

# convert the Laplacian to 8-bit, since HoughLines expects a single-channel 8-bit image
edges = cv2.convertScaleAbs(laplacian)

# this returns an array of (r, theta) values, one pair per detected line
lines = cv2.HoughLines(edges, 1, np.pi / 180, 200)

# loop over every detected line (HoughLines returns None when nothing is found)
if lines is not None:
    for line in lines:
        r, theta = line[0]

        # cos(theta) and sin(theta) give the direction of the line's normal
        a = np.cos(theta)
        b = np.sin(theta)

6 - If the Hough transform detects lines (check the angle theta above against expectations, with some tolerance), your border is present. Delete that 20-pixel border from your image.
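
A minimal sketch of that final check, assuming a 5-degree tolerance and the 20-pixel border width from step 1 (both numbers are assumptions; `img` and `lines` come from the earlier steps):

tol = np.deg2rad(5)   # assumed angular tolerance, in radians
border = 20           # assumed maximum border width from step 1

def is_axis_aligned(theta):
    # theta near 0 or pi is a vertical line, near pi/2 a horizontal one
    return min(abs(theta), abs(theta - np.pi / 2), abs(theta - np.pi)) < tol

if lines is not None and any(is_axis_aligned(line[0][1]) for line in lines):
    # border detected: crop the assumed frame away on all four sides
    img = img[border:-border, border:-border]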

Note - this is just pseudo-code to get you started. Real-world problems require a lot of custom work and experimentation.

I managed to find a way that worked for me, although it doesn't work if there are other horizontal and vertical shapes in the image.

The idea I used was to start from the assumption that borders are horizontal and vertical shapes, and that such shapes only exist at the borders (meaning the image itself contains no other vertical or horizontal lines, which is a stretch, but my use case satisfied that assumption).

Here's the code I used:

import cv2
import numpy as np

# box is the input form image, loaded as grayscale (file name here is just an example)
box = cv2.imread('form.png', cv2.IMREAD_GRAYSCALE)

# extract horizontal and vertical lines
only_box = extract_all_squares(box, kernel_length=7)
# build up a mask of the same size as the image
mask = np.zeros(box.shape, dtype='uint8')
# get contours of horizontal and vertical lines
contours, hierarchy = cv2.findContours(only_box, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# draw contours on mask
mask = cv2.drawContours(mask, contours, -1, (255, 255, 255), thickness=cv2.FILLED)
# threshold mask and image
ret, mask = cv2.threshold(mask, 20, 255, cv2.THRESH_BINARY)
ret, box = cv2.threshold(box, 20, 255, cv2.THRESH_BINARY)
# remove the bits we don't want
box[mask == 0] = 255

With the following helper functions

def extract_all_squares(image, kernel_length):
    """
    Binarizes image, keeping only vertical and horizontal lines
    hopefully, it'll help us detect squares
    Args:
        image: input image (already cropped to the region of interest)
        kernel_length: length of kernel to use. Too long and you will catch everything,
            too short and you catch nothing
    Returns:
        image binarized and keeping only vertical and horizontal lines
    """
    # threshold and invert the image: dark ink becomes white (255), light background becomes black (0)
    (thresh, img_bin) = cv2.threshold(image, 128, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)

    # A vertical kernel of (1 x kernel_length), which will detect all the vertical lines in the image.
    vertical_ksize = (1, kernel_length)
    # Morphological operation to detect vertical lines from an image
    verticle_lines_img = extract_lines(img_bin, vertical_ksize)

    # A horizontal kernel of (kernel_length x 1), which will help detect all the horizontal lines in the image.
    horizontal_ksize = (kernel_length, 1)
    # Morphological operation to detect horizontal lines from an image
    horizontal_lines_img = extract_lines(img_bin, horizontal_ksize)
    img_final_bin = add_lines_together(verticle_lines_img, horizontal_lines_img)

    return img_final_bin


def extract_lines(image, ksize):
    """
    extract lines (horizontal or vertical, depending on ksize)
    Args:
        image: binarized image
        ksize: size of kernel to use. Possible values :
            horizontal_ksize = (kernel_length, 1)
            vertical_ksize = (1, kernel_length)
    Returns:
        lines from image (vertical or horizontal, depending on ksize)
    """
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, ksize)
    img_temp = cv2.erode(image, kernel, iterations=3)
    lines_img = cv2.dilate(img_temp, kernel, iterations=3)
    return lines_img


def add_lines_together(verticle_lines_img, horizontal_lines_img, alpha=0.5, beta=0.5):
    """
    combine the vertical-line and horizontal-line images into one
    Args:
        verticle_lines_img: image with vertical lines
        horizontal_lines_img: image with horizontal lines
        alpha : weight of first image. Keep at 0.5 for balance
        beta : weight of second image. Keep at 0.5 for balance
            alpha and beta are weighting parameters, this will
            decide the quantity of an image to be added to make a new image
    Returns:
        image with an addition of both vertical and horizontal lines
    """

    # addWeighted blends the two images with the given weights to produce a third image.
    img_final_bin = cv2.addWeighted(verticle_lines_img, alpha, horizontal_lines_img, beta, 0.0)
    # A kernel of (3 x 3) ones.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    # erodes boundaries of features, gets rid of some noise
    img_final_bin = cv2.erode(~img_final_bin, kernel, iterations=2)
    # further kill noise by thresholding
    (thresh, img_final_bin) = cv2.threshold(img_final_bin, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    return img_final_bin
