
Splitting text and background as a preprocessing step for OCR (Tesseract)

I am applying OCR to text in TV footage (I am using Tesseract 3.x with C++). I am trying to separate the text from the background as a preprocessing step for OCR.

In typical footage, the text and background are highly contrasted (such as white against black), so adjusting the gamma does the job. However, the attached image (yellow text against an orange/red sky) is giving me a hard time with preprocessing.

[Image: yellow text against an orange sky]

What would be a good way to separate this yellow text from the background?

Below is a simple solution using Python 2.7, OpenCV 3.2.0 and Tesseract 4.0.0a. Converting the OpenCV part from Python to C++ should not be difficult; you can then call the Tesseract API to perform OCR.

import numpy as np
import cv2
import matplotlib.pyplot as plt
# Jupyter notebook magic; omit this line when running as a plain script
%matplotlib inline

def show(title, img, color=True):
    if color:
        plt.imshow(img[:,:,::-1]), plt.title(title), plt.show()
    else:
        plt.imshow(img, cmap='gray'), plt.title(title), plt.show()

def ocr(img):
    # I used a version of OpenCV with Tesseract binding. Modes set to:
    #   Page Segmentation mode (PSmode) = 11 (default = 3)
    #   OCR Engine Mode (OEM) = 3 (default = 3)
    tesser = cv2.text.OCRTesseract_create('C:/Program Files/Tesseract 4.0.0/tessdata/', 'eng',
                                          'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz', 3, 3)
    retval = tesser.run(img, 0)  # returns the recognized text as a string
    print('OCR Output: ' + retval)

img = cv2.imread('./imagesStackoverflow/yellow_text.png')
show('original', img)

# apply GaussianBlur to smooth the image, then threshold: pixels in the yellow
# range become white (255) and everything else becomes black (0)
img = cv2.GaussianBlur(img, (5, 5), 1)  # smooth image
mask = cv2.inRange(img, (40, 180, 200), (70, 220, 240))  # keep the yellow range; bounds are (B, G, R) low and high
show('mask', mask, False)

# invert the mask so the text is black on a white background
res = 255 - mask
show('result', res, False)

# pass to tesseract to perform OCR
ocr(res)
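
To make the thresholding step easier to port (for example to C++), here is a minimal NumPy-only sketch of what the mask and inversion steps compute. The helper name `in_range_bgr` and the 2x2 pixel values are hypothetical and only for illustration; the bounds are the ones used above, and both ends are treated as inclusive, as `cv2.inRange` does.

```python
import numpy as np

def in_range_bgr(img, lower, upper):
    """Return a uint8 mask: 255 where every channel is within [lower, upper], else 0."""
    lower = np.asarray(lower, dtype=img.dtype)
    upper = np.asarray(upper, dtype=img.dtype)
    # A pixel is selected only if all three channels (B, G, R) are inside the bounds.
    inside = np.all((img >= lower) & (img <= upper), axis=-1)
    return inside.astype(np.uint8) * 255

# Synthetic 2x2 BGR image (made-up values, not sampled from the footage):
# left column is yellowish "text", right column is orange/red "sky".
img = np.array([[[55, 200, 220], [30, 90, 230]],
                [[60, 210, 235], [20, 120, 200]]], dtype=np.uint8)

mask = in_range_bgr(img, (40, 180, 200), (70, 220, 240))
res = 255 - mask  # invert so the text is black on a white background

print(mask)
print(res)
```

Here the left column falls inside the yellow range and ends up black in `res`, while the right column ends up white. On real footage the bounds would likely need tuning per image.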

Processed images and OCR output (see the last line in the image):

[Image: processed images and OCR output]

Hope this helps.
