How to programmatically read over a scanned document or image

I've searched around on the net, as I'm a bit of a n00b when it comes to OCR, and I'm actually not sure where a good starting point would be.

I'd like to build an app that can identify and count, for example, how many check boxes are filled in on any given row of a document or image (it could even be another format, should anyone know of something that would better suit an application of this type). The ultimate goal is to eliminate manual data capture and speed up getting the overall statistics to the end user of the application.

I code primarily in C#, so a .NET solution would be preferable, but if not, I'll take what I can get.

What I had in mind was to redesign the forms the users fill in to something similar to this (excuse the crude ASCII art :P), so the person filling in the form only has to check a value on the paper.

                |  1  |  2  |  3  |  4  |  5  |  
Product A       | [ ]   [ ]   [ ]   [ ]   [x] |    
Product B       | [ ]   [ ]   [x]   [ ]   [ ] |

Any ideas would be greatly appreciated.

Thank you!

1) You could also check out the free but very capable Tesseract OCR engine. It is written in C++, but you could probably interface to it from C# fairly easily.

2) If you would like to roll your own with image processing, you could look at the EmguCV library, which is a .NET wrapper for OpenCV.

There was a recent post under the opencv tag that was trying to solve a problem very similar to yours: detecting marks on a lotto card.

You can try using the Office MODI (Microsoft Office Document Imaging) library.

Other options are:

  1. a commercial OCR library, or
  2. implementing your own bitmap recognition logic (this might be feasible if you have full control over the layout of what has to be scanned).
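As a rough illustration of the second option, here is a minimal sketch in Python (standard library only) of what home-grown recognition could look like when the layout is fixed. It assumes the scan has already been loaded as a grayscale grid and is not skewed; the names `ORIGIN`, `PITCH`, `BOX_SIZE`, `box_rect`, `fill_ratio` and `marked_columns`, as well as the threshold values, are made up for this example and would need to be measured on a real form.

```python
from typing import List, Tuple

Gray = List[List[int]]            # image rows; each pixel is 0 (black) to 255 (white)
Rect = Tuple[int, int, int, int]  # left, top, right, bottom (right/bottom exclusive)

# Hypothetical layout constants: measure these once on your own scanned form.
ORIGIN = (10, 10)   # top-left corner of the first box, in pixels
PITCH = (20, 20)    # horizontal and vertical distance between neighbouring boxes
BOX_SIZE = 10       # width and height of one box


def box_rect(row: int, col: int) -> Rect:
    """Locate the box at a 0-based (row, col) position on the regular grid."""
    left = ORIGIN[0] + col * PITCH[0]
    top = ORIGIN[1] + row * PITCH[1]
    return left, top, left + BOX_SIZE, top + BOX_SIZE


def fill_ratio(img: Gray, rect: Rect, dark_below: int = 128) -> float:
    """Fraction of pixels inside rect that are darker than the threshold."""
    left, top, right, bottom = rect
    dark = sum(
        1
        for y in range(top, bottom)
        for x in range(left, right)
        if img[y][x] < dark_below
    )
    return dark / ((right - left) * (bottom - top))


def marked_columns(img: Gray, row: int, n_cols: int = 5,
                   min_fill: float = 0.15) -> List[int]:
    """Return the 1-based column numbers whose boxes look filled in."""
    return [
        col + 1
        for col in range(n_cols)
        if fill_ratio(img, box_rect(row, col)) >= min_fill
    ]
```

If the box outlines themselves are printed in a dark colour, they count as dark pixels too, so in practice you would either shrink the rectangle to the inside of the box or raise `min_fill` until empty boxes stop registering.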

If all you're doing is looking for X's in boxes, then you could print the form in light blue and ask people to mark the boxes with a black ink pen.

You just scan the image and look for the black X pixels. They should be relatively easy to find compared to the light blue form. Particular x, y coordinates on the scanned image would correspond to the answer and product type, respectively.
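A minimal sketch of that idea in Python (standard library only), assuming the scan is available as rows of RGB tuples and is aligned well enough that box centres sit at known coordinates. The function names, the `max_channel` cut-off and the `min_ink` count are illustrative assumptions, not values taken from any real scanner or form.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]   # (red, green, blue), each 0..255
Image = List[List[Pixel]]      # indexed as image[y][x]


def is_black_ink(pixel: Pixel, max_channel: int = 80) -> bool:
    """Black ink is dark in every channel; light blue keeps high green/blue values."""
    return max(pixel) <= max_channel


def count_ink(img: Image, cx: int, cy: int, half: int = 4) -> int:
    """Count ink pixels in a square window centred on (cx, cy)."""
    return sum(
        1
        for y in range(cy - half, cy + half + 1)
        for x in range(cx - half, cx + half + 1)
        if is_black_ink(img[y][x])
    )


def read_form(img: Image, answer_xs: Dict[int, int], product_ys: Dict[str, int],
              min_ink: int = 10) -> Dict[str, List[int]]:
    """Map each product (a y coordinate) to the answers (x coordinates) marked for it."""
    return {
        product: [
            answer
            for answer, x in answer_xs.items()
            if count_ink(img, x, y) >= min_ink
        ]
        for product, y in product_ys.items()
    }
```

For the form in the question, `answer_xs` would map the scores 1 to 5 to the x coordinates of the five columns and `product_ys` would map "Product A" and "Product B" to the y coordinates of their rows. Returning a list per product rather than a single value makes it easy to flag rows with no mark or with more than one.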

What you need is Optical Mark Recognition (OMR). If you are planning commercial software, have a look at ABBYY FlexiCapture Engine, an SDK for integrating data and document capture technologies into server, desktop and mobile applications. It's not free, but for a business it can add serious value to your product.

You could also use a cloud service: a website that lets you upload an image and sends the OCR'ed data back to you. Try www.ocrsdk.com, a cloud-based OCR SDK recently launched by ABBYY. It's now in beta, so it's completely free to use. It requires the end user's device to have an internet connection, but it's completely independent of your programming language choice and the user's device resources. There are both .NET and Java code samples available on GitHub.

Disclaimer: I work at ABBYY.
