Filtering an image to improve text recognition

Question

I have the source image below (already cropped), and I am trying to do some image processing before reading the text from it.

[Image 1: source image]

With Python and OpenCV, I tried to remove the lines in the background using k-means with k = 2, and the result is:

[Image 2: k-means result, k = 2]

I tried to smooth the image using the code below:

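import tempfile

import cv2
import numpy as np
from PIL import Image

# Assumed values: the question does not show how these constants are defined.
IMAGE_SIZE = 1800
BINARY_THREHOLD = 180


def process_image_for_ocr(file_path):
    # TODO : Implement using opencv
    temp_filename = set_image_dpi(file_path)
    im_new = remove_noise_and_smooth(temp_filename)
    return im_new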


def set_image_dpi(file_path):
    im = Image.open(file_path)
    length_x, width_y = im.size
    factor = max(1, int(IMAGE_SIZE / length_x))
    size = factor * length_x, factor * width_y
    # size = (1800, 1800)
    # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter.
    im_resized = im.resize(size, Image.LANCZOS)
    temp_file = tempfile.NamedTemporaryFile(delete=False, suffix='.jpg')
    temp_filename = temp_file.name
    im_resized.save(temp_filename, dpi=(300, 300))
    return temp_filename


def image_smoothening(img):
    ret1, th1 = cv2.threshold(img, BINARY_THREHOLD, 255, cv2.THRESH_BINARY)
    ret2, th2 = cv2.threshold(th1, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    blur = cv2.GaussianBlur(th2, (1, 1), 0)
    ret3, th3 = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return th3


def remove_noise_and_smooth(file_name):
    img = cv2.imread(file_name, 0)
    filtered = cv2.adaptiveThreshold(img.astype(np.uint8), 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 41, 3)
    kernel = np.ones((1, 1), np.uint8)
    opening = cv2.morphologyEx(filtered, cv2.MORPH_OPEN, kernel)
    closing = cv2.morphologyEx(opening, cv2.MORPH_CLOSE, kernel)
    img = image_smoothening(img)
    or_image = cv2.bitwise_or(img, closing)
    return or_image

And the result is:

[Image 3: result of the smoothing code]

Can you help me with any ideas for removing the lines in the background of the source image?

Answer 1

One approach is to compute a k-means unsupervised segmentation of the image. You just need to play with the k and i_val values to get the desired output.

First, create a function that finds the k threshold values. It calculates an image histogram and clusters the pixel intensities with k-means (in the code below, the histogram is only returned for inspection; the clustering itself runs on the raw pixel values). .ravel() flattens the NumPy array to 1-D, and np.reshape(img, (-1, 1)) then converts it to a 2-D array of shape (n, 1). Next, we carry out the k-means clustering as described here.
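To make the reshaping step concrete, here is a minimal sketch on a made-up 3x4 array standing in for a grayscale image (the array values are an assumption for illustration only). It shows the shapes at each step and the float32 conversion that cv2.kmeans expects for its samples.

```python
import numpy as np

# Stand-in 3x4 "grayscale image"; the values are arbitrary.
gray = np.arange(12, dtype=np.uint8).reshape(3, 4)

flat = gray.ravel()                    # 1-D view of all pixels, shape (12,)
samples = np.reshape(flat, (-1, 1))    # one row per pixel, one feature per row
samples = samples.astype(np.float32)   # cv2.kmeans expects float32 samples

print(gray.shape, flat.shape, samples.shape, samples.dtype)
# prints: (3, 4) (12,) (12, 1) float32
```

The two steps can also be collapsed into gray.reshape(-1, 1), which gives the same (n, 1) layout.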

The function takes the input grayscale image, the number of clusters k, and i_val, the index of the sorted cluster center to use as the threshold. It returns the threshold value at your desired i_val, along with the sorted centers and the histogram.

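import cv2
import numpy as np


def kmeans(input_img, k, i_val):
    # Histogram of gray levels; returned for inspection, not used by the clustering.
    hist = cv2.calcHist([input_img], [0], None, [256], [0, 256])
    # cv2.kmeans expects float32 samples of shape (n, 1).
    img = input_img.ravel()
    img = np.reshape(img, (-1, 1))
    img = img.astype(np.float32)

    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    flags = cv2.KMEANS_RANDOM_CENTERS
    compactness, labels, centers = cv2.kmeans(img, k, None, criteria, 10, flags)
    centers = np.sort(centers, axis=0)

    # Return a plain int so cv2.threshold receives a scalar threshold.
    return int(centers[i_val, 0]), centers, hist


img = cv2.imread('Y8CSE.jpg', 0)  # 0 = load as grayscale
threshold_value = kmeans(input_img=img, k=8, i_val=2)[0]
_, thresh = cv2.threshold(img, threshold_value, 255, cv2.THRESH_BINARY)
cv2.imwrite('text.png', thresh)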

The output for this looks like:

You could carry on with this method by using morphological operators, or pre-mask the image using a Hough transform, as seen in the first answer here.

Answer 1 (score 3, accepted), 2018-07-31 08:59:59

过滤图像以改善文本识别

问题描述

1 个解决方案

解决方案1 3 已采纳 2018-07-31 08:59:59

解决方案1
3 已采纳 2018-07-31 08:59:59