Opencv, Python - How to remove the gray pixels around the date text

I am trying to remove the grayish "noise" surrounding the dates using Python/OpenCV to help OCR (Optical Character Recognition) recognize the dates.

The original image looks like this: https://static.mothership.sg/1/2017/03/10-Feb-MC-1.jpg

The Python script I tried is shown below. However, I have other similar images in which the contrast or lighting conditions vary.

import cv2
import numpy as np

img = cv2.imread("mc.jpeg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

alpha = 3.5
beta = -2

new = alpha * img + beta
new = np.clip(new, 0, 255).astype(np.uint8)

cv2.imwrite("cleaned.png", new)

I also tried thresholding and/or adaptive thresholding, and sometimes I was able to separate the dates from the grayish background. Sometimes it was very challenging. Is there an automatic way to determine the threshold value?

Below are examples of what I hope to achieve.

[image]

[image]

Blurry image: [image]

Otsu's binarization automatically calculates a threshold value from the image histogram.

# Otsu's thresholding after Gaussian filtering
blur = cv2.GaussianBlur(img,(5,5),0)
ret,Otsu = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

# cv2.imwrite needs a file extension to choose the encoder
cv2.imwrite("otsu_thresholding.png", Otsu)

See this link.
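For intuition about what the cv2.THRESH_OTSU flag computes, here is a rough NumPy-only sketch of Otsu's criterion: pick the gray level that maximizes the between-class variance of the histogram. This is not OpenCV's implementation; the function name otsu_threshold and the synthetic two-mode test image are assumptions made up for illustration, and the value it returns may not match cv2.threshold exactly.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the gray level t that maximizes the between-class variance
    when a uint8 image is split into pixels <= t and pixels > t.
    (Hypothetical helper for illustration, not an OpenCV function.)"""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)                  # weight of the class "<= t"
    w1 = 1.0 - w0                         # weight of the class "> t"
    cum_mean = np.cumsum(prob * levels)   # unnormalized mean of the class "<= t"
    total_mean = cum_mean[-1]
    # Between-class variance; an empty class gives 0/0, which is treated as 0
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (total_mean * w0 - cum_mean) ** 2 / (w0 * w1)
    between = np.nan_to_num(between, nan=0.0, posinf=0.0, neginf=0.0)
    return int(np.argmax(between))

# Synthetic stand-in for a page: dark "text" rows on top, bright "paper" rows below
rng = np.random.default_rng(0)
dark = rng.normal(60, 10, size=5000)
bright = rng.normal(200, 10, size=5000)
gray = np.clip(np.concatenate([dark, bright]), 0, 255).astype(np.uint8).reshape(100, 100)

t = otsu_threshold(gray)
binary = np.where(gray > t, 255, 0).astype(np.uint8)  # same rule as cv2.THRESH_BINARY
print("threshold:", t)
```

On a histogram with two well-separated peaks the chosen level falls between them, which is why Otsu tends to work on evenly lit scans and struggles when the background brightness varies across the page.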

You can try to build a model of the background and then weight each input pixel by that model. The output gain should be relatively constant across most of the image. These are the steps for this method:

  1. Apply a soft median blur filter to get rid of small noise.
  2. Get the model of the background via local maximum. Apply a very strong close operation with a big structuring element (I'm using a rectangular kernel of size 15).
  3. Perform gain adjustment by dividing 255 by each local maximum pixel. Multiply each input image pixel by this value.
  4. You should get a nice image where the background illumination is pretty much normalized. Threshold this image to get a binary mask of the text.

This is the code:

import numpy as np
import cv2

# image path
path = "C:/opencvImages/sheet01.jpg"

# Read an image in default mode:
inputImage = cv2.imread(path)

# Remove small noise via median:
filterSize = 5
imageMedian = cv2.medianBlur(inputImage, filterSize)

# Get local maximum:
kernelSize = 15
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
localMax = cv2.morphologyEx(imageMedian, cv2.MORPH_CLOSE, maxKernel, None, None, 1, cv2.BORDER_REFLECT101)

# Adjust image gain:
height, width, depth = localMax.shape

# Create output Mat:
outputImage = np.zeros(shape=[height, width, depth], dtype=np.uint8)

for i in range(0, height):

    for j in range(0, width):
        # Get current BGR pixels:
        v1 = inputImage[i, j]
        v2 = localMax[i, j]

        # Gain adjust:
        tempArray = []
        for c in range(0, 3):

            currentPixel = v2[c]
            if currentPixel != 0:
                gain = 255 / v2[c]
                gain = v1[c] * gain
            else:
                gain = 0

            # Gain set and clamp:
            tempArray.append(np.clip(gain, 0, 255))

        # Set pixel vec to out image:
        outputImage[i, j] = tempArray

# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(outputImage, cv2.COLOR_BGR2GRAY)

# Threshold:
threshValue = 110
_, binaryImage = cv2.threshold(grayscaleImage, threshValue, 255, cv2.THRESH_BINARY)

# Write image:
imageFilename = "C:/opencvImages/binaryMask2.png"
cv2.imwrite(imageFilename, binaryImage)

I get the following results testing the complete image:

And the cropped text:

Please note that the gain adjustment operations are not vectorized. The script is slow, mainly because I'm just starting with Python and don't know the proper NumPy syntax to speed up this operation. I've been using C++ for a long time, so feel free to further improve the code.
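One possible vectorized form of the gain-adjustment loop is sketched below. It assumes inputImage and localMax are uint8 arrays of the same shape, as in the script above; the function name adjust_gain and the random test arrays are made up for illustration. It keeps the loop's arithmetic (input * (255 / localMax), with 0 where localMax is 0) and the same clamp to 0..255.

```python
import numpy as np

def adjust_gain(input_image, local_max):
    """Vectorized version of the per-pixel gain loop (hypothetical helper).
    Computes input * (255 / local_max) per channel, 0 where local_max == 0."""
    src = input_image.astype(np.float64)
    bg = local_max.astype(np.float64)
    # Gain factor per pixel and channel; entries stay 0 where the background is 0
    factor = np.zeros_like(bg)
    np.divide(255.0, bg, out=factor, where=bg != 0)
    # Clamp and truncate to uint8, as the assignment into the uint8 output did
    return np.clip(src * factor, 0, 255).astype(np.uint8)

# Random stand-ins for inputImage and localMax, only to exercise the function
rng = np.random.default_rng(0)
input_image = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)
local_max = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)
local_max[0, 0] = 0  # exercise the "background is zero" branch

output_image = adjust_gain(input_image, local_max)
print(output_image.shape, output_image.dtype)
```

In the script this would replace the nested loops with something like outputImage = adjust_gain(inputImage, localMax); the rest of the pipeline stays the same.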

Edit:

Please be aware that your result can only be as good as the quality of your input. Look at your input and ask yourself, "Is this a good input for an automated process?" (Automated processes are usually not very smart.) The second picture you posted is very low quality. Not only is it blurry, it is also low-res and has compression artifacts. All these factors will hinder automated processing.

With that said, here's an improvement you can include in the original: try to normalize brightness-contrast on the grayscale output:

grayscaleImage = np.uint8(cv2.normalize(grayscaleImage, grayscaleImage, 0, 255, cv2.NORM_MINMAX))

Your grayscale image goes from this:

to this:

A little bit darker and improved in contrast. Let's try to compute the optimal threshold value automatically via Otsu thresholding:

threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)

It gets you this:

However, we can adjust the result if we add a bias to Otsu's threshold, like this:

threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
bias = 0.9
threshValue = bias * threshValue
_, binaryImage = cv2.threshold(grayscaleImage, threshValue, 255, cv2.THRESH_BINARY)

That's the best quality you can get with these images using this method. If you find these suggestions and tips useful, please at least up-vote my answer.
