
How to implement imbinarize in OpenCV

I developed a script in Matlab that analyses engraved text on coloured steel. I use a range of morphological techniques to extract the text and read it with OCR. I need to run it on a Raspberry Pi, so I decided to port my Matlab code to OpenCV (in Python). I have transferred some methods and they work similarly, but how do I implement imreconstruct and imbinarize (shown below) in OpenCV? (The challenge here is to properly differentiate foreground from background.)

Maybe I should try adding grabCut, getStructuringElement, morphologyEx or dilate? I have tried them in a range of combinations but have not found a perfect solution.

I will post the whole script for both. If anyone could suggest how to generally improve this extraction and the accuracy of the OCR process, I would greatly appreciate it.

Based on the bin values of the grey-scale image, I change some parameters in these functions:

Matlab:

se = strel('disk', 300);
img = imtophat(img, se);
maker = imerode(img, strel('line',100,0)); %for whiter ones
maker = imerode(img, strel('line',85,0)); %for medium
maker = imerode(img, strel('line',5,0));

imgClear = imreconstruct(maker, img);

imgBlur = imgaussfilt(imgClear,1); %less blur for whiter frames

BW = imbinarize(imgBlur,'adaptive','ForegroundPolarity','Bright',...
    'Sensitivity',0.7);   %process for medium

BW = imbinarize(imgBlur, 'adaptive', 'ForegroundPolarity',...
        'Dark', 'Sensitivity', 0.4); % process for black and white

res = ocr(BW, 'CharacterSet', '0123456789', 'TextLayout', 'Block');
res.Text;

OpenCV:

kernel = numpy.ones((5,5),numpy.uint8)

blur = cv2.GaussianBlur(img,(5,5),0)
erosion = cv2.erode(blur,kernel,iterations = 1)
opening = cv2.morphologyEx(erosion, cv2.MORPH_OPEN, kernel)

#bremove = cv2.grabCut(opening,mask,rect,bgdModelmode==GC_INIT_WITH_MASK)
#th3 = cv2.adaptiveThreshold(opening,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU,11,2)

ret, thresh= cv2.threshold(opening,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

ocr = pytesseract.image_to_string(Image.open('image2.png'),config='stdout -c tessedit_char_whitelist=0123456789')

Here is the input image:

(image)

Matlab result (reads 573702)

OpenCV result (reads 573102)

A lighter image:

Full recognition results of the Matlab script:

I worked out the code below to get a positive result on your engraved text sample.

import cv2
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

def show(img):
    plt.imshow(img, cmap="gray")
    plt.show()

# load the input image
img = cv2.imread('./imagesStackoverflow/engraved_text.jpg',0)
show(img)

ret, mask = cv2.threshold(img, 60, 120, cv2.THRESH_BINARY)  # tune 60, 120 for the best OCR results
kernel = np.ones((5,3),np.uint8)
mask = cv2.erode(mask,kernel,iterations = 1)
show(mask)

# I used a version of OpenCV with Tesseract; you may use your pytesseract and set the modes as:
#   OCR Engine Mode (OEM) = 3 (default = 3)
#   Page Segmentation Mode (PSM) = 11 (default = 3)
tesser = cv2.text.OCRTesseract_create('C:/Program Files/Tesseract 4.0.0/tessdata/','eng','0123456789',11,3)
retval = tesser.run(mask, 0) # returns a string

print('OCR:' + retval)

Processed image and OCR output:

(image)

It would be great if you could report back your test results with more sample images.

I am surprised at how much difference there is between Matlab and OpenCV when they both appear to use the same algorithm. Why do you run imbinarize twice? What does the Sensitivity keyword actually do, mathematically, behind the scenes? They obviously do several steps more than the bare Otsu.

import cv2
import numpy as np
import matplotlib.pyplot as plt

def show(img):
    plt.imshow(img, cmap="gray")
    plt.show()

img = cv2.imread("letters.jpg", cv2.IMREAD_GRAYSCALE)

kernel = np.ones((3,3), np.uint8)

blur = cv2.GaussianBlur(img,(3,3), 0)
erosion = cv2.erode(blur, kernel, iterations=3)
opening = cv2.dilate(erosion, kernel)

th3 = cv2.adaptiveThreshold(opening, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
            cv2.THRESH_BINARY, 45, 2)
show(th3)

kernel2 = cv2.getGaussianKernel(6, 2) #np.ones((6,6))
kernel2 = np.outer(kernel2, kernel2)
th3 = cv2.dilate(th3, kernel2)
th3 = cv2.erode(th3, kernel)
show(th3)

The images that get displayed are:

First image, the direct result of thresholding:

After a bit of cleaning up:

With a bit more cleanup and deskewing. Definitely not as good as the Matlab output.

So all in all it is not the same, and certainly not as nice as Matlab. But the basic principle seems the same; the numbers just need playing with.

A better approach would probably be to threshold by the mean of the image and then use that output as a mask for adaptively thresholding the original image. Hopefully the results would then be better than both OpenCV and Matlab.

Try it with ADAPTIVE_THRESH_MEAN_C: you can get some really nice results, but there is more trash lying around. Again, if you can use it as a mask to isolate the text and then threshold again, it might turn out better. The shape of the erosion and dilation kernels will also make a big difference here.

What I can see from your code is that you used top-hat filtering as the first step in your Matlab code; however, I couldn't see the same in your Python OpenCV code. OpenCV has a built-in top-hat filter; try applying it to get a similar result:

kernel = np.ones((5,5),np.uint8)
tophat = cv2.morphologyEx(img, cv2.MORPH_TOPHAT, kernel)

Also, try using CLAHE: it gives your image better contrast. Then apply blackhat to filter out small details.

clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
cl1 = clahe.apply(img)

I have got better results by applying these transformations.

I tried the code below; it manages to recognize the lighter engraved text sample. Hope it helps.

def show(img):
    plt.imshow(img, cmap="gray")
    plt.show()

# load the input image
img = cv2.imread('./imagesStackoverflow/engraved_text2.jpg',0)
show(img)

# apply CLAHE to adjust the contrast
clahe = cv2.createCLAHE(clipLimit=5.1, tileGridSize=(5,3))
cl1 = clahe.apply(img)
img = cl1.copy()
show(img)

img = cv2.GaussianBlur(img,(3,3), 1)
ret, mask = cv2.threshold(img, 125, 150, cv2.THRESH_BINARY)  # tune 125, 150 for the best OCR results

kernel = np.ones((5,3),np.uint8)
mask = cv2.erode(mask,kernel,iterations = 1)
show(mask)

# I used a version of OpenCV with Tesseract; you may use your pytesseract and set the modes as:
#   Page Segmentation Mode (PSM) = 11 (default = 3)
#   OCR Engine Mode (OEM) = 3 (default = 3)
tesser = cv2.text.OCRTesseract_create('C:/Program Files/Tesseract 4.0.0/tessdata/','eng','0123456789',11,3)
retval = tesser.run(mask, 0) # returns a string

print('OCR:' + retval)

(processed image and OCR output)
