简体   繁体   中英

Filtering Image For Improving Text Recognition

I have this source image below (after cropped) and I try to do some image processing before I read text.

此搜索

With python and opencv, I tried to remove the lines in the background with k-means with k =2, and the result is

镜像2

I tried to smooth the image using this code below

def process_image_for_ocr(file_path):
# TODO : Implement using opencv
temp_filename = set_image_dpi(file_path)
im_new = remove_noise_and_smooth(temp_filename)
return im_new


def set_image_dpi(file_path):
    im = Image.open(file_path)
    length_x, width_y = im.size
    factor = max(1, int(IMAGE_SIZE / length_x))
    size = factor * length_x, factor * width_y
    # size = (1800, 1800)
    im_resized = im.resize(size, Image.ANTIALIAS)
    temp_file = tempfile.NamedTemporaryFile(delete=False, suffix='.jpg')
    temp_filename = temp_file.name
    im_resized.save(temp_filename, dpi=(300, 300))
    return temp_filename


def image_smoothening(img):
    ret1, th1 = cv2.threshold(img, BINARY_THREHOLD, 255, cv2.THRESH_BINARY)
    ret2, th2 = cv2.threshold(th1, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    blur = cv2.GaussianBlur(th2, (1, 1), 0)
    ret3, th3 = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return th3


def remove_noise_and_smooth(file_name):
    img = cv2.imread(file_name, 0)
    filtered = cv2.adaptiveThreshold(img.astype(np.uint8), 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 41, 3)
    kernel = np.ones((1, 1), np.uint8)
    opening = cv2.morphologyEx(filtered, cv2.MORPH_OPEN, kernel)
    closing = cv2.morphologyEx(opening, cv2.MORPH_CLOSE, kernel)
    img = image_smoothening(img)
    or_image = cv2.bitwise_or(img, closing)
    return or_image

And the result is

图像3

Can you help me (any idea) to remove the lines on the background of the source image?

One approach to achieve this is by computing a k-means unsupervised segmentation of the image. You just need to play with the k and i_val values to get the desired output.

First, you need to create a function which will find the k threshold values.This simply calculates an image histogram which is used to compute the k_means. .ravel() just converts your numpy array to a 1-D array. np.reshape(img, (-1,1)) then converts it to an 2-D array which is of shape n,1 . Next we carry out the k_means as described here .

The function takes the input gray-scale image, your number of k intervals and the value you want to threshold from ( i_val ). It returns the threshold value at your desired i_val .

def kmeans(input_img, k, i_val):
    hist = cv2.calcHist([input_img],[0],None,[256],[0,256])
    img = input_img.ravel()
    img = np.reshape(img, (-1, 1))
    img = img.astype(np.float32)

    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    flags = cv2.KMEANS_RANDOM_CENTERS
    compactness,labels,centers = cv2.kmeans(img,k,None,criteria,10,flags)
    centers = np.sort(centers, axis=0)

    return centers[i_val].astype(int), centers, hist

img = cv2.imread('Y8CSE.jpg', 0)
_, thresh = cv2.threshold(img, kmeans(input_img=img, k=8, i_val=2)[0], 255, cv2.THRESH_BINARY)
cv2.imwrite('text.png',thresh)

The output for this looks like:

K-MEANS门槛

You could carry on with this method by using morphological operators , or pre-mask the image using a hough transform as seen in the first answer here .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM