
OpenCV: Letters and words detection from edge detection image

I am currently dealing with text recognition. Here is part of a binarized image after edge detection (using Canny):

EDIT: I am posting a link to an image. I don't have 10 rep points, so I cannot post the image itself.

EDIT 2: And here's the same piece after thresholding. Honestly, I don't know which approach would be better.


The questions remain the same:

  1. How should I detect individual letters? I need to determine the location of every letter and then of every word.

  2. Is it a problem that some letters are "opened"? I mean that their outlines do not form closed areas.

  3. If I use cv::matchTemplate, does it mean that I need to have 24 templates for every letter + 10 for every digit? And then loop over my image to determine the best correlation?

  4. If both the letters and the squares they are in are 1 pixel wide, what filters or operations should I apply to close the opened letters? I tried various combinations of dilate and erode, with no effect.

The question is kind of "how do I do OCR with OpenCV?" and the answer is that it's an involved process and quite difficult.

But some pointers. Firstly, it's hard to detect letters which are outlined. Most of the tools are designed for filled letters. But that image looks as if there will only be one non-letter distractor if you fill all loops using a certain size threshold. You can get rid of the non-letter lines because they are a huge connected object.
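As a rough illustration of those two pointers, here is a minimal sketch in plain Python (no OpenCV, so it runs anywhere). It assumes the image is a list of rows of 0/1 values with outlines as 1. The function names and both area thresholds are made up for this example and would need tuning on the real image. A "loop" is taken to be a background region that does not reach the image border.

```python
from collections import deque

def regions(grid, value, diagonal):
    """Return connected regions (lists of (y, x)) of cells equal to `value`."""
    h, w = len(grid), len(grid[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diagonal:  # 8-connectivity, used for thin foreground outlines
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seen = [[False] * w for _ in range(h)]
    found = []
    for y in range(h):
        for x in range(w):
            if grid[y][x] != value or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            pixels = []
            while queue:  # breadth-first flood fill from (y, x)
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy, dx in steps:
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and grid[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            found.append(pixels)
    return found

def fill_small_loops(grid, max_hole_area):
    """Fill enclosed background regions no larger than max_hole_area."""
    h, w = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for pixels in regions(grid, 0, diagonal=False):
        enclosed = not any(y in (0, h - 1) or x in (0, w - 1) for y, x in pixels)
        if enclosed and len(pixels) <= max_hole_area:
            for y, x in pixels:
                out[y][x] = 1
    return out

def drop_large_components(grid, max_area):
    """Erase foreground components bigger than max_area (e.g. the grid lines)."""
    out = [row[:] for row in grid]
    for pixels in regions(grid, 1, diagonal=True):
        if len(pixels) > max_area:
            for y, x in pixels:
                out[y][x] = 0
    return out

# Toy image: a large frame with one small outlined "letter" inside it.
rows = [
    "111111111111",
    "100000000001",
    "101110000001",
    "101010000001",
    "101110000001",
    "100000000001",
    "111111111111",
]
image = [[int(c) for c in row] for row in rows]
filled = fill_small_loops(image, max_hole_area=4)    # fills the letter only
cleaned = drop_large_components(filled, max_area=20)  # removes the frame
for row in cleaned:
    print("".join(map(str, row)))
```

The size threshold is what keeps the inside of the big square from being filled. Note that a letter whose outline is not closed has no enclosed region, so this sketch leaves it unfilled.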

Once you've filled the letters, they can be skeletonised.

You can't use morphological operations like open and close very sensibly on images where the details are one pixel wide. You can put the image through the operation, but essentially there is no distinction between detail and noise if all features are one pixel wide. However, once you fill the letters, that problem goes away.
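To make that concrete, here is a small plain-Python sketch of a morphological opening (erosion followed by dilation). The 3x3 square structuring element is an assumption for the example, as are the function names. A one-pixel-wide outline vanishes completely, while the same shape filled in survives unchanged.

```python
def erode(grid):
    """Binary erosion with a 3x3 square; pixels outside the image count as 0."""
    h, w = len(grid), len(grid[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if all(grid[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)):
                out[y][x] = 1
    return out

def dilate(grid):
    """Binary dilation with a 3x3 square."""
    h, w = len(grid), len(grid[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if any(grid[ny][nx]
                   for ny in range(max(0, y - 1), min(h, y + 2))
                   for nx in range(max(0, x - 1), min(w, x + 2))):
                out[y][x] = 1
    return out

def opening(grid):
    """Opening = erosion followed by dilation."""
    return dilate(erode(grid))

def parse(rows):
    return [[int(c) for c in row] for row in rows]

outline = parse(["000000", "011110", "010010", "010010", "011110", "000000"])
solid = parse(["000000", "011110", "011110", "011110", "011110", "000000"])

outline_opened = opening(outline)  # every pixel is erased
solid_opened = opening(solid)      # identical to the input
print(sum(map(sum, outline_opened)), solid_opened == solid)
```

With one-pixel features the erosion step cannot tell a letter stroke from a speck of noise, so both are removed.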

This isn't in any way telling you how to do it, just giving some pointers.

As mentioned in the previous answer by malcolm, OCR will work better on filled letters, so you can do the following:

  1. Use your second approach, but take the inverse result and not the one you are showing.

  2. Run connected component labeling.

  3. For each component, run the OCR algorithm.
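The three steps above could look roughly like this in plain Python. This is only a sketch under assumptions: the thresholded image is a list of rows with letters as 0 and background as 1, components are 8-connected, and the names `invert`, `component_boxes` and `crop` are invented for the example. The OCR call itself is left out. Each cropped patch is what you would hand to whatever recogniser you choose.

```python
from collections import deque

def invert(grid):
    """Step 1: flip the thresholded image so letters become foreground (1)."""
    return [[1 - v for v in row] for row in grid]

def component_boxes(grid):
    """Step 2: label 8-connected components and return (x, y, w, h) boxes."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not grid[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            min_x = max_x = x
            min_y = max_y = y
            while queue:
                cy, cx = queue.popleft()
                min_x, max_x = min(min_x, cx), max(max_x, cx)
                min_y, max_y = min(min_y, cy), max(max_y, cy)
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if grid[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return sorted(boxes)  # left to right, then top to bottom

def crop(grid, box):
    """Step 3 input: cut one component out so it can be passed to OCR."""
    x, y, w, h = box
    return [row[x:x + w] for row in grid[y:y + h]]

thresholded = [[int(c) for c in row] for row in (
    "1111111",
    "1001101",
    "1001101",
    "1111111",
)]
letters = invert(thresholded)
boxes = component_boxes(letters)
patches = [crop(letters, box) for box in boxes]
print(boxes)
```

If you stay inside OpenCV, cv::connectedComponentsWithStats gives per-component bounding boxes and areas directly, so the labeling loop above would not be needed.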

In order to discard outliers, I would try to use the spatial relation between detected letters. They should have other letters horizontally or vertically next to them.
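One possible reading of that idea, sketched in plain Python: treat each detection as an (x, y, w, h) box and keep it only if some other box sits beside it or above/below it within a small gap. The `max_gap` value, the box coordinates and the function names are all assumptions for the example.

```python
def overlap(a0, a1, b0, b1):
    """True if half-open intervals [a0, a1) and [b0, b1) share some extent."""
    return min(a1, b1) - max(a0, b0) > 0

def gap(a0, a1, b0, b1):
    """Distance between two intervals, 0 if they touch or overlap."""
    return max(b0 - a1, a0 - b1, 0)

def are_neighbours(a, b, max_gap):
    """Boxes are neighbours if they line up in a row or a column, close together."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    side_by_side = (overlap(ay, ay + ah, by, by + bh)
                    and gap(ax, ax + aw, bx, bx + bw) <= max_gap)
    stacked = (overlap(ax, ax + aw, bx, bx + bw)
               and gap(ay, ay + ah, by, by + bh) <= max_gap)
    return side_by_side or stacked

def discard_isolated(boxes, max_gap):
    """Keep only boxes that have at least one neighbouring box."""
    return [a for i, a in enumerate(boxes)
            if any(are_neighbours(a, b, max_gap)
                   for j, b in enumerate(boxes) if i != j)]

detections = [
    (10, 10, 8, 12),   # three letters in a row...
    (22, 10, 8, 12),
    (34, 11, 8, 12),
    (200, 150, 5, 5),  # ...and one stray blob far away
]
kept = discard_isolated(detections, max_gap=6)
print(kept)
```

The same neighbour test could also be chained to group letters into words, by merging boxes that are side by side until the horizontal gap exceeds a word-spacing threshold.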

Good luck
