Cleaning image for OCR

I've been trying to clean images for OCR (specifically, the lines):

[image: sample input with lines to remove]

I need to remove these lines so I can sometimes further process the image. I'm getting pretty close, but a lot of the time the threshold takes away too much from the text:

    import cv2

    # img is a grayscale image, e.g. img = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)
    copy = img.copy()
    blur = cv2.GaussianBlur(copy, (9,9), 0)
    thresh = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 11, 30)

    # Dilate to merge nearby pixels into larger blobs
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9,9))
    dilate = cv2.dilate(thresh, kernel, iterations=2)

    # Find external contours (handles both OpenCV 3.x and 4.x return formats)
    cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]

    # Highlight blobs above a fixed area threshold
    for c in cnts:
        area = cv2.contourArea(c)
        if area > 300:
            x,y,w,h = cv2.boundingRect(c)
            cv2.rectangle(copy, (x, y), (x + w, y + h), (36,255,12), 3)

Edit: Additionally, using constant numbers will not work if the font changes. Is there a generic way to do this?

Here's an idea. We break this problem up into several steps:

  1. Determine average rectangular contour area. We threshold, then find contours and filter using the bounding rectangle area of each contour. The reason we do this is the observation that any typical character will only be so big, whereas large noise will span a larger rectangular area. We then determine the average area.

  2. Remove large outlier contours. We iterate through the contours again and remove any contour that is 5x larger than the average contour area by filling it in. Instead of using a fixed threshold area, we use this dynamic threshold for more robustness.

  3. Dilate with a vertical kernel to connect characters. The idea is to take advantage of the observation that characters are aligned in columns. By dilating with a vertical kernel we connect the text together, so noise will not be included in this combined contour.

  4. Remove small noise. Now that the text to keep is connected, we find contours and remove any contour smaller than 4x the average contour area.

  5. Bitwise-and to reconstruct the image. Since only the desired contours remain on our mask, we bitwise-and with the input to preserve the text and get our result.


Here's a visualization of the process:

We apply Otsu's threshold to obtain a binary image, then find contours to determine the average rectangular contour area. From here we remove the large outlier contours, highlighted in green, by filling them in.

[image: Otsu's threshold result]  [image: large outlier contours highlighted in green]

Next we construct a vertical kernel and dilate to connect the characters. This step connects all the desired text to keep and isolates the noise into individual blobs.

[image: text connected by vertical dilation]

Now we find contours and filter using contour area to remove the small noise.

[image: small noise removed by contour area filtering]

Here are all the removed noise particles highlighted in green.

[image: removed noise highlighted in green]

Result

[image: final result]

Code

import cv2

# Load image, grayscale, and Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Determine average contour area
average_area = [] 
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    area = w * h
    average_area.append(area)

average = sum(average_area) / len(average_area)

# Remove large lines if contour area is 5x bigger than the average contour area
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    area = w * h
    if area > average * 5:  
        cv2.drawContours(thresh, [c], -1, (0,0,0), -1)

# Dilate with vertical kernel to connect characters
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,5))
dilate = cv2.dilate(thresh, kernel, iterations=3)

# Remove small noise if contour area is smaller than 4x average
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    area = cv2.contourArea(c)
    if area < average * 4:
        cv2.drawContours(dilate, [c], -1, (0,0,0), -1)

# Bitwise mask with input image
result = cv2.bitwise_and(image, image, mask=dilate)
result[dilate==0] = (255,255,255)

cv2.imshow('result', result)
cv2.imshow('dilate', dilate)
cv2.imshow('thresh', thresh)
cv2.waitKey()
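
If the next step is actually running OCR, the cleaned result image from the code above can be passed to an OCR engine. The snippet below is only a minimal sketch, assuming pytesseract and the Tesseract binary are installed; the --psm 6 setting is an illustrative choice and may need tuning for other layouts.

    import cv2
    import pytesseract

    # Tesseract expects RGB ordering, while OpenCV images are BGR
    rgb = cv2.cvtColor(result, cv2.COLOR_BGR2RGB)

    # --psm 6 treats the image as a single uniform block of text (illustrative choice)
    text = pytesseract.image_to_string(rgb, config='--psm 6')
    print(text)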

Note: Traditional image processing is limited to thresholding, morphological operations, and contour filtering (contour approximation, area, aspect ratio, or blob detection). Since input images can vary based on character text size, finding a single solution is quite difficult. You may want to look into training your own classifier with machine/deep learning for a dynamic solution.
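
As an example of the aspect-ratio filtering mentioned in the note, long thin blobs can also be rejected by the shape of their bounding box rather than by area alone. This is only a sketch under assumed values (the 5:1 ratio is illustrative), not part of the approach above:

    import cv2

    image = cv2.imread('1.png')
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

    cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]
    for c in cnts:
        x, y, w, h = cv2.boundingRect(c)
        # Very elongated blobs are more likely lines than characters
        aspect_ratio = max(w, h) / max(min(w, h), 1)
        if aspect_ratio > 5:
            cv2.drawContours(thresh, [c], -1, (0, 0, 0), -1)

    cv2.imshow('thresh', thresh)
    cv2.waitKey()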
