
Template matching - Image subtraction

Original post: 2010-10-01 08:52:06 | Tags: algorithm, image-processing, opencv, image-manipulation, subtraction

I have a project in which I need to subtract an empty template image from an incoming, user-filled image. The documents are ordinary bank cheques.

The aim is to extract the handwritten fields by subtracting the empty template image from the filled image.

The issue I am facing is aligning the two images, since they differ in scale, translation, rotation, etc.

Any ideas on how to align the template image with the incoming image?

UPDATE 1:

I am posting an example image from the Wikipedia page, but in monochrome, as my images are monochrome.

2 answers

When working with image processing for industrial projects, we have a fiducial in most cases. A fiducial is a mark (it can be a hole or a cross mark) that never changes and is always in the same position.

Generally, two fiducials are enough to correct misalignment problems such as rotation, translation, and scale. For instance, if you know the correct distance between the two, you can always measure it in the incoming image to check that the scale factor is right, or correct the scale based on how the measured distance compares with the correct one.
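To illustrate how two fiducials pin down rotation, translation, and scale, here is a minimal Python sketch. It assumes the two fiducials have already been located in both images; the function names `similarity_from_fiducials` and `map_point` are hypothetical, and points are treated as complex numbers x + iy so that the whole transform is `q = a*p + b`.

```python
import cmath

def similarity_from_fiducials(p1, p2, q1, q2):
    """Estimate the similarity transform mapping template fiducials
    (p1, p2) onto incoming-image fiducials (q1, q2).

    Points are (x, y) tuples. Returns (scale, angle_radians, a, b),
    where a and b are complex and q = a * p + b.
    """
    zp1, zp2 = complex(*p1), complex(*p2)
    zq1, zq2 = complex(*q1), complex(*q2)
    if zp1 == zp2:
        raise ValueError("the two template fiducials must be distinct")
    a = (zq2 - zq1) / (zp2 - zp1)   # encodes rotation and scale
    b = zq1 - a * zp1               # encodes translation
    return abs(a), cmath.phase(a), a, b

def map_point(a, b, p):
    """Map a template point (x, y) into the incoming image."""
    z = a * complex(*p) + b
    return (z.real, z.imag)

# Template fiducials 10 px apart; in the incoming image they are 20 px apart
# and turned by 90 degrees, so the scale factor is 2.
scale, angle, a, b = similarity_from_fiducials((0, 0), (10, 0), (5, 5), (5, 25))
```

The ratio of the two fiducial distances is the scale factor, which is the check described above. To warp the incoming image back onto the template you would apply the inverse mapping, `p = (q - b) / a`.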

In your case, what I would ask you is: do the template and the incoming image share any visual marks that are invariant and can easily be segmented?

Once you have the answer to that question, all the rest becomes simpler: the difference itself is quite a straightforward algorithm.
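As a rough sketch of the difference step, assuming the two monochrome images are already aligned, have the same size, and are stored as lists of rows with 1 for ink and 0 for background (the representation and the name `extract_handwriting` are assumptions for illustration):

```python
def extract_handwriting(filled, template):
    """Keep only the ink that is in the filled image but not in the template.

    Both images are lists of rows of 0/1 values (1 = ink), same size.
    """
    if len(filled) != len(template) or any(
        len(f_row) != len(t_row) for f_row, t_row in zip(filled, template)
    ):
        raise ValueError("images must have the same dimensions")
    return [
        [1 if f == 1 and t == 0 else 0 for f, t in zip(f_row, t_row)]
        for f_row, t_row in zip(filled, template)
    ]

template = [[1, 1, 1, 1],
            [0, 0, 0, 0],
            [0, 0, 0, 0]]
filled = [[1, 1, 1, 1],
          [0, 1, 1, 0],
          [0, 0, 1, 0]]
handwriting = extract_handwriting(filled, template)
```

In practice small residual misalignments leave thin outlines of the printed template, so some cleanup after the subtraction is usually needed.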

The basic answer is to write a function that takes two images and a 2D transform and tells you how well aligned they are once you apply the transform to the target image. The function needs to be continuous in the transform parameters and have a minimum (0) where the images are aligned perfectly. This is called a cost function.
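One possible cost function, sketched in plain Python under several assumptions: images are lists of rows of floats, the transform is parameterized as translation `(tx, ty)`, `scale`, and `angle`, and the score is the mean squared pixel difference. Bilinear sampling is used so that the cost varies continuously with the parameters. The names `sample_bilinear` and `alignment_cost` are hypothetical.

```python
import math

def sample_bilinear(img, x, y):
    """Bilinearly interpolate img at (x, y); pixels outside count as 0."""
    h, w = len(img), len(img[0])
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    def px(xx, yy):
        return img[yy][xx] if 0 <= xx < w and 0 <= yy < h else 0.0

    top = px(x0, y0) * (1 - fx) + px(x0 + 1, y0) * fx
    bottom = px(x0, y0 + 1) * (1 - fx) + px(x0 + 1, y0 + 1) * fx
    return top * (1 - fy) + bottom * fy

def alignment_cost(template, target, tx, ty, scale, angle):
    """Mean squared difference between the template and the target sampled
    through the transform (rotate by angle, scale, translate by tx, ty).

    Returns 0.0 when the transformed target matches the template exactly.
    """
    c = math.cos(angle) * scale
    s = math.sin(angle) * scale
    h, w = len(template), len(template[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            u = c * x - s * y + tx   # template (x, y) -> target (u, v)
            v = s * x + c * y + ty
            d = template[y][x] - sample_bilinear(target, u, v)
            total += d * d
    return total / (h * w)

image = [[0.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 1.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0]]
perfect = alignment_cost(image, image, 0.0, 0.0, 1.0, 0.0)
shifted = alignment_cost(image, image, 1.0, 0.0, 1.0, 0.0)
```

Here `perfect` is zero because the identity transform aligns an image with itself, while `shifted` is positive. For real cheque-sized images a vectorized implementation would be needed; the nested loops are only meant to show the idea.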

Then run any optimization algorithm over that function; the inputs you are trying to optimize are the transform parameters (translation, scale, rotation). Examples are hill climbing, genetic algorithms, simulated annealing, etc.
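As an illustration of the optimization step, here is a minimal hill-climbing sketch. It is one simple variant (coordinate-wise steps that are halved when nothing improves), not the only way to do it. The `toy_cost` function is a made-up stand-in for a real image alignment cost, with its minimum placed at a known transform so the result can be checked; all names are hypothetical.

```python
def hill_climb(cost, start, steps, tol=1e-3, max_iter=10000):
    """Minimize cost(params) by trying +/- step on each parameter in turn.

    Accepts any move that lowers the cost; when no move helps, all step
    sizes are halved. Stops once every step is smaller than tol.
    """
    params = list(start)
    steps = list(steps)
    best = cost(params)
    for _ in range(max_iter):
        improved = False
        for i in range(len(params)):
            for delta in (steps[i], -steps[i]):
                trial = params[:]
                trial[i] += delta
                c = cost(trial)
                if c < best:
                    params, best, improved = trial, c, True
                    break
        if not improved:
            steps = [s / 2 for s in steps]
            if max(steps) < tol:
                break
    return params, best

def toy_cost(p):
    """Stand-in cost with its minimum at tx=3, ty=-2, scale=1.1, angle=0.05."""
    tx, ty, scale, angle = p
    return (tx - 3) ** 2 + (ty + 2) ** 2 + 10 * (scale - 1.1) ** 2 + 10 * (angle - 0.05) ** 2

# Start from the identity transform: no shift, scale 1, no rotation.
found, final_cost = hill_climb(toy_cost, [0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 0.1, 0.1])
```

Hill climbing only finds the nearest local minimum, so with real images the starting transform matters; that is one reason the other methods mentioned above (genetic algorithms, simulated annealing) are sometimes preferred.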

There are products that do this; they are usually called Forms Recognition, Forms Registration, Forms Processing, etc. Some are SDKs, but there are also applications that can do it without programming.

Disclaimer: I work at Atalasoft, where we sell a Forms Processing add-on to our .NET imaging SDK.