
Finding small image inside large (PIL + OpenCV)

I'm trying to do as described here: Finding a subimage inside a Numpy image, so that I can search for an image inside a screenshot.

The code looks like this:

import cv2
import numpy as np
import gtk.gdk
from PIL import Image

def make_screenshot():
    # Grab the full root window via PyGTK and return it as a PIL image
    w = gtk.gdk.get_default_root_window()
    sz = w.get_size()
    pb = gtk.gdk.Pixbuf(gtk.gdk.COLORSPACE_RGB, False, 8, sz[0], sz[1])
    pb = pb.get_from_drawable(w, w.get_colormap(), 0, 0, 0, 0, sz[0], sz[1])
    width, height = pb.get_width(), pb.get_height()
    return Image.fromstring("RGB", (width, height), pb.get_pixels())

if __name__ == "__main__":
    img = make_screenshot()
    # Convert the PIL image (RGB) to an OpenCV-style BGR numpy array
    cv_im = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
    # Cut a 10x10 patch out of the screenshot to use as the template
    template = cv_im[30:40, 30:40, :]
    result = cv2.matchTemplate(cv_im, template, cv2.TM_CCORR_NORMED)
    print np.unravel_index(result.argmax(), result.shape)

Depending on the method selected (instead of cv2.TM_CCORR_NORMED) I get completely different coordinates, but none of them is (30, 30) as in the example.

Please tell me, what is wrong with this approach?

Short answer: you need to use the following line to locate the corner of the best match:

minVal, maxVal, minLoc, maxLoc = cv2.minMaxLoc(result)

The variable maxLoc will hold a tuple containing the x, y indices of the upper left-hand corner of the best match.
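Putting the two together, here is a minimal sketch of the corrected lookup, assuming cv_im and template are built exactly as in the question:

result = cv2.matchTemplate(cv_im, template, cv2.TM_CCORR_NORMED)
# minMaxLoc returns the smallest/largest scores and their (x, y) locations
minVal, maxVal, minLoc, maxLoc = cv2.minMaxLoc(result)
print maxLoc  # expected to be (30, 30) for the template cut at [30:40, 30:40]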

Long answer:

cv2.matchTemplate() returns a single-channel image where the number at each index corresponds to how well the input image matched the template at that index. Try visualizing result by inserting the following lines of code after your call to matchTemplate, and you will see why numpy would have a difficult time making sense of it.

cv2.imshow("Debugging Window", result)
cv2.waitKey(0)
cv2.destroyAllWindows()

minMaxLoc() turns the result returned by matchTemplate into the information you want. If you want to know where the template had the worst match, or what values result held at the best and worst matches, you can use those values too.
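Note that if you switch to cv2.TM_SQDIFF or cv2.TM_SQDIFF_NORMED, lower values mean better matches, so you would take minLoc instead of maxLoc. A small sketch of that case, still assuming the cv_im and template from the question:

result = cv2.matchTemplate(cv_im, template, cv2.TM_SQDIFF_NORMED)
minVal, maxVal, minLoc, maxLoc = cv2.minMaxLoc(result)
top_left = minLoc  # for the squared-difference methods the minimum is the best match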

This code worked for me on an example image that I read from a file. If your code continues to misbehave, you probably aren't reading in your images the way you want to. The above snippet of code is useful for debugging with OpenCV. Replace the result argument in imshow with the name of any image object (a numpy array) to visually confirm that you are getting the image you want.
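If it helps, here is a small sketch of that kind of visual check, drawing a box around the best match on a copy of the screenshot; it assumes the cv_im, template, and maxLoc variables from above:

h, w = template.shape[:2]
debug = cv_im.copy()
# Draw a red box whose top-left corner is the best-match location
cv2.rectangle(debug, maxLoc, (maxLoc[0] + w, maxLoc[1] + h), (0, 0, 255), 2)
cv2.imshow("Debugging Window", debug)
cv2.waitKey(0)
cv2.destroyAllWindows()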
