
How to optimally preprocess images for Tesseract in C# when the grayscale text color "interferes" with the background color?

I'm struggling to find an optimal binarization as a preprocessing step for OCR (Tesseract in C#).

The images are 1624 x 1728 pixels in size and contain car GUI elements (buttons, sliders, info boxes) with their corresponding text, taken from one generation of a car navigation command interface (different use-case scenarios such as radio control, car control, etc.). The images contain multiple colors; most of them are predominantly dark blue, and the text is white, gray, or close to white. Unfortunately, I cannot share the images for data privacy reasons.

Problem: I cannot efficiently separate the text from the background (text should become black, everything else white), because the text color spans a wide range and, in the grayscale image, partially overlaps with the background color.

Current procedure: First I convert the RGB image from System.Drawing.Image to OpenCvSharp.Mat. Then I convert the Mat from color to grayscale, and then from grayscale to binary.

This is the main code for the binarization:

Mat binarized = grayscaled.Threshold(tresh, maxVal, ThresholdTypes.BinaryInv);

I use 255 as maxVal. With tresh = 90, the binarized image looks OK overall (even though the Tesseract results are bad), but some pixels of the text in the bottom control elements (and of some other text) turn white because the threshold is too high, so some characters are blurry and incomplete.

With a value like tresh = 40, the characters of the bottom control elements become complete and sharp (as they should be), but the background in the middle of the image turns completely black, so some of the text there disappears inside a big black block. So the problem is that the text pixels cover a wide range of gray values in the grayscale image, and that range "interferes" with the gray values of other elements or the background, which makes the text hard to extract.

Note: I already tried adaptive thresholding (MeanC and GaussianC) with different thresholds, kernel sizes, and mean subtraction constants, without good results.

Question: What would be an efficient solution for the preprocessing?

I'm thinking about writing a method that binarizes from RGB rather than from grayscale. The method would take an RGB image as input and map the white text color range to black and everything else to white.
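A minimal sketch of that color-based idea, assuming OpenCvSharp and a BGR input Mat. The helper name BinarizeLightText and the limits maxSaturation and minValue are hypothetical; the defaults are illustrative guesses and would have to be tuned on the real images. It works in HSV, where white and gray text has low saturation while the dark blue background has high saturation, even when both end up with similar gray values.

```csharp
using OpenCvSharp;

public static class ColorBinarizer
{
    // Hypothetical helper: treats bright, low-saturation (white/gray) pixels as text.
    // maxSaturation and minValue are assumptions, not tuned values.
    public static Mat BinarizeLightText(Mat bgr, int maxSaturation = 60, int minValue = 120)
    {
        using var hsv = new Mat();
        Cv2.CvtColor(bgr, hsv, ColorConversionCodes.BGR2HSV);

        // For 8-bit images OpenCV stores H in 0..179 and S, V in 0..255.
        // Any hue is accepted; only saturation and brightness are constrained.
        using var textMask = new Mat();
        Cv2.InRange(hsv,
                    new Scalar(0, 0, minValue),
                    new Scalar(179, maxSaturation, 255),
                    textMask);

        // InRange marks text pixels with 255; invert so text is black on white.
        var binarized = new Mat();
        Cv2.BitwiseNot(textMask, binarized);
        return binarized;
    }
}
```

Gray (non-text) GUI elements that are as bright and unsaturated as the text would also pass this test, so the result may still need cleanup before it goes to Tesseract.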

One approach is to remove any frequencies in the image lower than that of your text. This can be done by creating a blurred copy of the image, with a kernel a bit larger than your text, and subtracting this blurred image from the original. This should keep high frequencies, i.e. text and other edges, while removing any vignetting or other gradients over the image. Keep in mind that the resulting image will have a different range of values, some of which will probably be negative.
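The blur-and-subtract idea could look roughly like this in OpenCvSharp. This is a sketch under assumptions: the class name HighPassBinarizer is hypothetical, kernelSize = 51 is a guess that should be replaced by an odd number a bit larger than the text height, and the final Otsu threshold is one possible way to binarize the result, not part of the suggestion above. The subtraction is done in floating point so that the negative values are kept rather than clipped to 0.

```csharp
using OpenCvSharp;

public static class HighPassBinarizer
{
    // gray: 8-bit single-channel image. kernelSize must be odd (assumed value).
    public static Mat Binarize(Mat gray, int kernelSize = 51)
    {
        // Convert to float so that negative differences survive the subtraction.
        using var grayF = new Mat();
        gray.ConvertTo(grayF, MatType.CV_32FC1);

        // Low-pass copy: sigma 0 lets OpenCV derive sigma from the kernel size.
        using var blurred = new Mat();
        Cv2.GaussianBlur(grayF, blurred, new Size(kernelSize, kernelSize), 0);

        // High-pass = original minus low-pass; gradients and vignetting cancel out.
        using var highPass = new Mat();
        Cv2.Subtract(grayF, blurred, highPass);

        // Rescale the (partly negative) values to 0..255 and go back to 8 bit.
        Cv2.Normalize(highPass, highPass, 0, 255, NormTypes.MinMax);
        using var highPass8 = new Mat();
        highPass.ConvertTo(highPass8, MatType.CV_8UC1);

        // Bright text becomes black, everything else white.
        var binarized = new Mat();
        Cv2.Threshold(highPass8, binarized, 0, 255,
                      ThresholdTypes.BinaryInv | ThresholdTypes.Otsu);
        return binarized;
    }
}
```

Because other sharp edges (button borders, slider outlines) are high-frequency content too, they will likely survive this step along with the text.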

Another option would be to split the image into sections, and use different thresholds in each, but that may lead to artifacts at the section boundaries.
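A rough sketch of the section-wise variant, again assuming OpenCvSharp. The class name TiledBinarizer and the 4 x 4 grid are hypothetical, and picking each section's threshold with Otsu is an assumption; any per-section rule could be used instead.

```csharp
using System;
using OpenCvSharp;

public static class TiledBinarizer
{
    // gray: 8-bit single-channel image. The tile counts are assumed values.
    public static Mat Binarize(Mat gray, int tilesX = 4, int tilesY = 4)
    {
        var binarized = new Mat(gray.Size(), MatType.CV_8UC1);

        // Round up so the tiles cover the whole image.
        int tileW = (gray.Cols + tilesX - 1) / tilesX;
        int tileH = (gray.Rows + tilesY - 1) / tilesY;

        for (int y = 0; y < gray.Rows; y += tileH)
        {
            for (int x = 0; x < gray.Cols; x += tileW)
            {
                var roi = new Rect(x, y,
                                   Math.Min(tileW, gray.Cols - x),
                                   Math.Min(tileH, gray.Rows - y));

                // Views into the source and destination; no pixel data is copied here.
                using var src = new Mat(gray, roi);
                using var dst = new Mat(binarized, roi);

                // Otsu chooses a separate threshold for this tile only.
                using var tile = new Mat();
                Cv2.Threshold(src, tile, 0, 255,
                              ThresholdTypes.BinaryInv | ThresholdTypes.Otsu);
                tile.CopyTo(dst);
            }
        }
        return binarized;
    }
}
```

Besides the boundary artifacts mentioned above, a tile that contains no text still gets split into black and white by Otsu, so such tiles may need to be detected (for example by their low contrast) and set to white.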
