
Object Detection for Android with Tesseract or OpenCV

I have successfully integrated Tesseract into my Android app, and it reads whatever image I capture, but with very low accuracy. Most of the time I do not get the correct text after capturing, because some text around the region of interest is also captured.

All I want is to read all the text inside a rectangular area accurately, without capturing the edges of the rectangle. I have done some research and posted on Stack Overflow about this twice, but I still have not found a satisfactory solution.

These are the two posts I made:

https://stackoverflow.com/questions/16663504/extract-text-from-a-captured-image?noredirect=1#comment23973954_16663504

Extracting information from captured image in android

I am not sure whether to continue with Tesseract or switch to OpenCV.

Given the many links and answers from others, I think it's good to take a step back and note that there are actually two fundamental steps to optical character recognition (OCR):

  • Text Detection: This is the title and focus of your question. It is concerned with localizing the regions of an image that contain text.
  • Text Recognition: This is where the actual recognition happens: the image regions localized by detection are segmented character by character and classified. This is also where tools like Tesseract come into play.
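
To make the split between the two steps concrete, here is a minimal sketch in plain Python. It is not a real text detector: `dark_pixel_box` is a hypothetical stand-in that simply takes the bounding box of all dark pixels, and the image is assumed to be a list of rows of 0-255 grayscale values. The point is only that detection produces a box, and that the box can be shrunk by a few pixels (the `inset` parameter, also an assumption of this sketch) so that a drawn rectangle border is not handed to the recognizer.

```python
from typing import List, Tuple

Image = List[List[int]]          # rows of 0-255 grayscale values
Box = Tuple[int, int, int, int]  # left, top, right, bottom (right/bottom exclusive)


def dark_pixel_box(image: Image, threshold: int = 128) -> Box:
    """Toy stand-in for detection: bounding box of all pixels darker than threshold."""
    coords = [(x, y)
              for y, row in enumerate(image)
              for x, value in enumerate(row)
              if value < threshold]
    if not coords:
        raise ValueError("no dark pixels found")
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs) + 1, max(ys) + 1


def crop(image: Image, box: Box, inset: int = 0) -> Image:
    """Return the part of image inside box, shrunk by `inset` pixels on every side."""
    left, top, right, bottom = box
    left, top = left + inset, top + inset
    right, bottom = right - inset, bottom - inset
    if left >= right or top >= bottom:
        raise ValueError("inset leaves an empty region")
    return [row[left:right] for row in image[top:bottom]]
```

Only the cropped region would then be passed on to the recognition step (Tesseract, in the question's setup), which keeps surrounding text and the rectangle's own edges out of the OCR input.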

Now, there are also two general settings in which OCR is applied:

  • Controlled: These are images taken with a scanner or something similar, where the target is a document and things like perspective, scale, font, orientation, background consistency, etc. are well behaved.
  • Uncontrolled/Scene: These are the more natural, in-the-wild photos, e.g. those taken with a camera, where you are trying to recognize a street sign, shop name, etc.

Tesseract as-is is most applicable to the "controlled" setting. In general, and for scene OCR especially, "re-training" Tesseract will not directly improve detection, but it may improve recognition.

If you are looking to improve scene text detection, see this work; and if you are looking to improve scene text recognition, see this work. Since you asked about detection: the detection reference uses maximally stable extremal regions (MSER), for which there are plenty of implementation resources, e.g. see here.

There's also a text detection project specifically for Android:
https://github.com/dreamdragon/text-detection

As many have noted, keep in mind that recognition is still an open research challenge.

The solution to improving the OCR output is to

  • either use more training data to train it better

  • or filter its input using some linear filter (grayscaling, contrast enhancement, blurring)
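
As a rough illustration of the second option, the three filters named above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the code from the linked posts: images are assumed to be nested lists of pixels, the function names are made up for this example, the grayscale weights are the common ITU-R BT.601 luma weights, and the blur is a simple 3x3 mean filter. In an Android app you would more likely use an image-processing library for the same steps.

```python
from typing import List, Tuple

RGBImage = List[List[Tuple[int, int, int]]]  # rows of (r, g, b) pixels, 0-255 each
GrayImage = List[List[int]]                  # rows of 0-255 grayscale values


def to_grayscale(image: RGBImage) -> GrayImage:
    """Weighted sum of R, G, B using BT.601 luma weights (integer arithmetic)."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row]
            for row in image]


def stretch_contrast(image: GrayImage) -> GrayImage:
    """Linearly rescale so the darkest pixel maps to 0 and the brightest to 255."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in image]
    return [[(v - lo) * 255 // (hi - lo) for v in row] for row in image]


def box_blur(image: GrayImage) -> GrayImage:
    """3x3 mean filter; pixels at the border average over in-bounds neighbours only."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [image[ny][nx]
                      for ny in range(max(0, y - 1), min(h, y + 2))
                      for nx in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(window) // len(window))
        out.append(row)
    return out
```

A typical order would be `box_blur(stretch_contrast(to_grayscale(image)))` before handing the result to the OCR engine, though which filters actually help depends on the input images.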

In the chat we posted a number of links describing filtering techniques used in OCR, but no sample code was posted.

Some of the links posted were:

Improving input for OCR

How to train Tesseract

Text enhancement using asymmetric filters <-- this paper is easily found on Google and should be read in full, as it clearly illustrates the steps needed before OCR-processing an image.

OCR Classification
