
Image registration using python and cross-correlation

I have two images showing exactly the same content: 2D Gaussian-shaped spots. I call these two 16-bit PNG files "left.png" and "right.png". But as they are obtained through slightly different optical setups, the corresponding spots (physically the same) appear at slightly different positions: the right image is slightly stretched or distorted in a non-linear way. Therefore I would like to get the transformation from left to right.

So for every pixel on the left side, with its x- and y-coordinates, I want a function giving me the components of the displacement vector that points to the corresponding pixel on the right side.

In a former approach I tried to get the positions of the corresponding spots to obtain the relative distances deltaX and deltaY. I then fitted these distances to a Taylor expansion up to second order of T(x,y), giving me the x- and y-components of the displacement vector for every pixel (x,y) on the left, pointing to the corresponding pixel (x',y') on the right.

To get a more general result I would like to use normalized cross-correlation. For this I multiply every pixel value from the left with the corresponding pixel value from the right and sum over these products. The transformation I am looking for should connect the pixels that maximize this sum. So when the sum is maximized, I know that I multiplied the corresponding pixels.

I really tried a lot with this, but didn't manage. My question is whether somebody here has an idea or has ever done something similar.

import numpy as np
from PIL import Image  # 'import Image' only works with the old standalone PIL

left = np.array(Image.open('left.png'), dtype=float)
right = np.array(Image.open('right.png'), dtype=float)

# for normalization (http://en.wikipedia.org/wiki/Cross-correlation#Normalized_cross-correlation)
left = (left - left.mean()) / left.std()
right = (right - right.mean()) / right.std()
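
With both images normalized this way, the sum-of-products described above reduces, at zero displacement, to a single scalar. A minimal line for reference:

# normalized cross-correlation at zero displacement (a single scalar in [-1, 1])
ncc = np.mean(left * right)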

Please let me know if I can make this question clearer. I still have to check out how to post questions using LaTeX.

Thank you very much for your input.


[left.png] http://i.stack.imgur.com/oSTER.png [right.png] http://i.stack.imgur.com/Njahj.png

I'm afraid that in most cases 16-bit images appear just black (at least on the systems I use) :( but of course there is data in there.

UPDATE 1

Let me try to clarify my question. I am looking for a vector field of displacement vectors that point from every pixel in left.png to the corresponding pixel in right.png. My problem is that I am not sure about the constraints I have.

r' = r + d(r)

where the vector r (components x and y) points to a pixel in left.png and the vector r' (components x' and y') points to the corresponding pixel in right.png. For every r there is a displacement vector d(r).

What I did earlier was to find the components of the vector field d manually and fit them to a second-degree polynomial:

d(r) = (delta-x(x,y), delta-y(x,y))

So I fitted:

delta-x(x,y) = a0 + a1*x + a2*y + a3*x^2 + a4*x*y + a5*y^2

and

delta-y(x,y) = b0 + b1*x + b2*y + b3*x^2 + b4*x*y + b5*y^2

Does this make sense to you? Is it possible to get all the delta-x(x,y) and delta-y(x,y) with cross-correlation? The cross-correlation should be maximized if the corresponding pixels are linked together through the displacement vectors, right?

UPDATE 2

So the algorithm I was thinking of is as follows:

  1. Deform right.png
  2. Get the value of the cross-correlation
  3. Deform right.png further
  4. Get the value of the cross-correlation and compare it to the value before
  5. If it's greater, good deformation; if not, redo the deformation and do something else
  6. After maximizing the cross-correlation value, know what deformation there is :)

About the deformation: could one first do a shift along the x- and y-directions to maximize the cross-correlation, then in a second step stretch or compress x- and y-dependently, and in a third step deform quadratically x- and y-dependently, and repeat this procedure iteratively? I really have a problem doing this with integer coordinates. Do you think I would have to interpolate the picture to obtain a continuous distribution? I have to think about this again :( Thanks to everybody for taking part :)
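
For reference, a minimal sketch of this loop, assuming left and right are the normalized arrays from above and that the deformation is parameterized by the second-order polynomial from UPDATE 1. scipy.ndimage.map_coordinates interpolates between pixels, which sidesteps the integer-coordinate problem; the optimizer choice and all parameter names are illustrative, not a definitive implementation:

import numpy as np
from scipy import ndimage, optimize

def warp(img, params):
    # second-order polynomial displacement field; params = (a0..a5, b0..b5)
    ax, ay = params[:6], params[6:]
    y, x = np.mgrid[0:img.shape[0], 0:img.shape[1]].astype(float)
    dx = ax[0] + ax[1]*x + ax[2]*y + ax[3]*x**2 + ax[4]*x*y + ax[5]*y**2
    dy = ay[0] + ay[1]*x + ay[2]*y + ay[3]*x**2 + ay[4]*x*y + ay[5]*y**2
    # bilinear interpolation at the displaced, sub-pixel coordinates
    return ndimage.map_coordinates(img, [y + dy, x + dx], order=1, mode='nearest')

def neg_ncc(params, left, right):
    # negative normalized cross-correlation, so minimizing it maximizes the NCC
    w = warp(right, params)
    w = (w - w.mean()) / (w.std() + 1e-12)
    return -np.mean(w * left)

p0 = np.zeros(12)  # start from the identity (zero displacement)
res = optimize.minimize(neg_ncc, p0, args=(left, right), method='Nelder-Mead')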

OpenCV (and with it the Python OpenCV bindings) has a StarDetector class which implements this algorithm.

As an alternative you might have a look at the OpenCV SIFT class, which stands for Scale-Invariant Feature Transform.

Update

Regarding your comment: I understand that the "right" transformation will maximize the cross-correlation between the images, but I don't understand how you choose the set of transformations over which to maximize. Maybe if you know the coordinates of three matching points (either by some heuristics or by choosing them by hand), and if you expect an affine relationship, you could use something like cv2.getAffineTransform to have a good initial transformation for your maximization process. From there you could use small additional transformations to have a set over which to maximize. But this approach seems to me like re-inventing something which SIFT could take care of.

To actually transform your test image you can use cv2.warpAffine, which can also take care of border values (eg pad with 0). To calculate the cross-correlation you could use scipy.signal.correlate2d.
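
A short sketch of how those two calls fit together, assuming M is a 2x3 affine matrix (for example from cv2.getAffineTransform below) and left/right are float arrays; note that correlate2d over full images is slow:

import cv2
import numpy as np
from scipy.signal import correlate2d

h, w = right.shape
warped = cv2.warpAffine(right.astype(np.float32), M, (w, h), borderValue=0)
cc = correlate2d(left, warped, mode='same')  # full 2D cross-correlation map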

Update

Your latest update did indeed clarify some points for me. But I think that a vector field of displacements is not the most natural thing to look for, and this is also where the misunderstanding came from. I was thinking more along the lines of a global transformation T which, applied to any point (x,y) of the left image, gives (x',y') = T(x,y) on the right side, but where T has the same analytical form for every pixel. For example, this could be a combination of a displacement, rotation, scaling, and maybe some perspective transformation. I cannot say whether it is realistic or not to hope to find such a transformation; this depends on your setup, but if the scene is physically the same on both sides I would say it is reasonable to expect some affine transformation. This is why I suggested cv2.getAffineTransform. It is of course trivial to calculate your displacement vector field from such a T, as this is just T(x,y) - (x,y).
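
That last step could look like this, assuming M is the 2x3 affine matrix returned by cv2.getAffineTransform:

import numpy as np

y, x = np.mgrid[0:left.shape[0], 0:left.shape[1]]
pts = np.stack([x.ravel(), y.ravel(), np.ones(x.size)])  # homogeneous coordinates
d = M @ pts - pts[:2]  # displacement field: row 0 is delta-x, row 1 is delta-y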

The big advantage would be that you have only very few degrees of freedom for your transformation, instead of, I would argue, 2N degrees of freedom in the displacement vector field, where N is the number of bright spots.

If it is indeed an affine transformation, I would suggest an algorithm like this:

  • identify three bright and well-isolated spots in the left image
  • for each of these three spots, define a bounding box so that you can hope to identify the corresponding spot within it in the right image
  • find the coordinates of the corresponding spots, eg with some correlation method as implemented in cv2.matchTemplate or by just finding the brightest spot within the bounding box (see the sketch after this list)
  • once you have three matching pairs of coordinates, calculate the affine transformation which transforms one set into the other with cv2.getAffineTransform
  • apply this affine transformation to the left image; as a check whether you found the right one, you could test if the overall normalized cross-correlation is above some threshold, or drops significantly if you displace one image with respect to the other
  • if you wish and still need it, calculate the displacement vector field trivially from your transformation T
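
A hedged sketch of the matching steps, assuming left and right are float32 arrays and that three approximate spot positions in the left image are already known (the coordinates below are made up):

import cv2
import numpy as np

approx = [(40, 55), (180, 60), (100, 200)]  # hypothetical (x, y) spot guesses
half = 15                                   # half-size of the bounding box

src, dst = [], []
for (x, y) in approx:
    tmpl = left[y-half:y+half, x-half:x+half]  # patch around the left spot
    res = cv2.matchTemplate(right, tmpl, cv2.TM_CCORR_NORMED)
    _, _, _, (mx, my) = cv2.minMaxLoc(res)     # top-left corner of best match
    src.append((x, y))
    dst.append((mx + half, my + half))         # center of the matched patch

M = cv2.getAffineTransform(np.array(src, dtype='float32'),
                           np.array(dst, dtype='float32'))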

Update

It seems cv2.getAffineTransform expects the somewhat awkward input data type 'float32'. Let's assume the source coordinates are (sxi,syi) and the destination coordinates (dxi,dyi), with i=0,1,2; then what you need is

src = np.array( ((sx0,sy0),(sx1,sy1),(sx2,sy2)), dtype='float32' )
dst = np.array( ((dx0,dy0),(dx1,dy1),(dx2,dy2)), dtype='float32' )

result = cv2.getAffineTransform(src,dst)

I don't think a cross-correlation is going to help here, as it only gives you a single best shift for the whole image. There are three alternatives I would consider:

  1. Do a cross-correlation on sub-clusters of dots. Take, for example, the three dots in the top right and find the optimal x-y shift through cross-correlation. This gives you the rough transform for that corner. Repeat for as many clusters as you can to obtain a reasonable map of your transformations. Fit this with your Taylor expansion and you might get reasonably close. However, for your cross-correlation to work at all, the difference in displacement between spots must be less than the extent of the spot, else you can never get all spots in a cluster to overlap simultaneously with a single displacement. Under these conditions, option 2 might be more suitable.

  2. If the displacements are relatively small (which I think is a condition for option 1), then we might assume that for a given spot in the left image, the closest spot in the right image is the corresponding spot. Thus, for every spot in the left image, we find the nearest spot in the right image and use that as the displacement at that location. From the 40-something well-distributed displacement vectors we can obtain a reasonable approximation of the actual displacement by fitting your Taylor expansion.

  3. This is probably the slowest method, but might be the most robust if you have large displacements (and option 2 thus doesn't work): use something like an evolutionary algorithm to find the displacement. Apply a random transformation, compute the remaining error (you might need to define this as the sum of the smallest distances between spots in your original and transformed images), and improve your transformation with those results. If your displacements are rather large you might need a very broad search, as you'll probably get lots of local minima in your landscape.

I would try option 2, as it seems your displacements might be small enough to easily associate a spot in the left image with a spot in the right image.
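
A minimal sketch of option 2, assuming the spots are bright blobs on a dark background; the threshold and the polynomial order are assumptions:

import numpy as np
from scipy import ndimage

def spot_centers(img, thresh):
    # centroids of the connected regions above the threshold
    labels, n = ndimage.label(img > thresh)
    return np.array(ndimage.center_of_mass(img, labels, range(1, n + 1)))

pl = spot_centers(left, left.mean() + 3 * left.std())    # (row, col) pairs
pr = spot_centers(right, right.mean() + 3 * right.std())

# for every left spot, take the nearest right spot as its partner
d2 = ((pl[:, None, :] - pr[None, :, :]) ** 2).sum(axis=-1)
disp = pr[d2.argmin(axis=1)] - pl                        # displacement vectors

# least-squares fit of the second-order Taylor expansion for each component
y, x = pl[:, 0], pl[:, 1]
A = np.column_stack([np.ones_like(x), x, y, x**2, x*y, y**2])
coeff_dx, *_ = np.linalg.lstsq(A, disp[:, 1], rcond=None)
coeff_dy, *_ = np.linalg.lstsq(A, disp[:, 0], rcond=None)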

Update

I assume your optics induce non-linear distortions, and having two separate beam paths (different filters in each?) will make the relationship between the two images even more non-linear. The affine transformation PiQuer suggests might be a reasonable approach, but can probably never completely cover the actual distortions.

I think your approach of fitting to a low-order Taylor polynomial is fine. This works for all my applications with similar conditions. The highest orders probably should be something like xy^2 and x^2y; anything higher than that you won't notice.

Alternatively, you might be able to calibrate the distortions for each image first, and then do your experiments. This way you are not dependent on the distribution of your dots, but can use a high-resolution reference image to get the best description of your transformation.

Option 2 above still stands as my suggestion for getting the two images to overlap. This can be fully automated, and I'm not sure what you mean when you say you want a more general result.

Update 2

You comment that you have trouble matching dots in the two images. If this is the case, I think your iterative cross-correlation approach may not be very robust either. You have very small dots, so overlap between them will only occur if the difference between the two images is small.

In principle there is nothing wrong with your proposed solution, but whether it works or not strongly depends on the size of your deformations and the robustness of your optimization algorithm. If you start off with very little overlap, it may be hard to find a good starting point for your optimization. Yet if you have sufficient overlap to begin with, then you should have been able to find the deformation per dot first; but in a comment you indicate that this doesn't work.

Perhaps you can go for a mixed solution: find the cross-correlation of clusters of dots to get a starting point for your optimization, and then tweak the deformation using something like the procedure you describe in your update. Thus:

  1. For an NxN pixel segment, find the shift between the left and right images (see the sketch after this list)
  2. Repeat for, say, 16 of those segments
  3. Compute an approximation of the deformation using those 16 points
  4. Use this as the starting point of your optimization approach
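
Step 1 could look like the sketch below; the segment size N and the use of scipy.signal.fftconvolve for the correlation are assumptions:

import numpy as np
from scipy.signal import fftconvolve

def segment_shift(a, b):
    # (dy, dx) shift that best aligns segment b onto segment a, taken from
    # the peak of the FFT-based cross-correlation
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    cc = fftconvolve(a, b[::-1, ::-1], mode='same')
    py, px = np.unravel_index(cc.argmax(), cc.shape)
    return py - a.shape[0] // 2, px - a.shape[1] // 2

N = 128
dy, dx = segment_shift(left[:N, :N], right[:N, :N])  # shift of one segment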

You might want to have a look at bunwarpj, which already does what you're trying to do. It's not Python, but I use it in exactly this context. You can export a plain-text spline transformation and use it if you wish to do so.
