Finding shapes in an image using OpenCV

I'm trying to look for shapes in an image using OpenCV. I know the shapes I want to match (there are some shapes I don't know about, but I don't need to find them) and their orientations. I don't know their sizes (scale) or locations.

My current approach:

  1. Detect contours.
  2. For each contour, calculate its bounding box.
  3. Match each bounding box to one of the known shapes separately. In my real project, I'm scaling the region to the template size and calculating differences in Sobel gradient, but for this demo, I'm just using the aspect ratio.
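
The real-project variant of step 3 (scale the region to the template size, then compare) could be sketched roughly as follows. This is a simplification under stated assumptions: regions and templates are plain 0/1 grids, the resize is nearest-neighbour, and the score is a raw pixel mismatch rather than a Sobel-gradient difference. The names `resize_nearest`, `mismatch`, `classify` and the `max_mismatch` threshold are hypothetical, not OpenCV functions, and small L and T shapes stand in for the E and Y templates.

```python
def resize_nearest(grid, new_w, new_h):
    """Nearest-neighbour resize of a 2D list of 0/1 values."""
    old_h, old_w = len(grid), len(grid[0])
    return [[grid[r * old_h // new_h][c * old_w // new_w]
             for c in range(new_w)]
            for r in range(new_h)]


def mismatch(region, template):
    """Fraction of pixels that differ after scaling region to template size."""
    t_h, t_w = len(template), len(template[0])
    scaled = resize_nearest(region, t_w, t_h)
    differing = sum(scaled[r][c] != template[r][c]
                    for r in range(t_h) for c in range(t_w))
    return differing / float(t_h * t_w)


def classify(region, templates, max_mismatch=0.2):
    """Return the best-matching template name, or None for an unknown shape."""
    name, score = min(((n, mismatch(region, t)) for n, t in templates.items()),
                      key=lambda item: item[1])
    return name if score <= max_mismatch else None


templates = {
    'L': [[1, 0, 0],
          [1, 0, 0],
          [1, 1, 1]],
    'T': [[1, 1, 1],
          [0, 1, 0],
          [0, 1, 0]],
}

# The same L drawn twice as large (6x6) is scaled back down before comparing.
big_l = [[v for v in row for _ in range(2)]
         for row in templates['L'] for _ in range(2)]
print(classify(big_l, templates))
```

Because every region is rescaled to the template size first, the unknown scale drops out of the comparison; a region that matches no template within `max_mismatch` is reported as unknown.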

Where this approach comes undone is where shapes touch. The contour detection picks up the two adjacent shapes as a single contour (single bounding box). The matching step will then obviously fail.
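
The failure can be reproduced on a toy grid. The sketch below is not what `cv.FindContours` does internally; it is a plain connected-component pass (the helper name `bounding_boxes` and the use of 8-connectivity are assumptions made for illustration) that shows why two touching shapes end up with one bounding box.

```python
from collections import deque


def bounding_boxes(grid):
    """Return (x, y, width, height) for each 8-connected blob of 1s."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or seen[r][c]:
                continue
            # Flood-fill this blob, tracking its extent as we go.
            seen[r][c] = True
            queue = deque([(r, c)])
            min_r = max_r = r
            min_c = max_c = c
            while queue:
                cr, cc = queue.popleft()
                min_r, max_r = min(min_r, cr), max(max_r, cr)
                min_c, max_c = min(min_c, cc), max(max_c, cc)
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and grid[nr][nc] and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            boxes.append((min_c, min_r,
                          max_c - min_c + 1, max_r - min_r + 1))
    return boxes


# Two 2x2 squares with a gap between them: two boxes, each with ratio 1.
separate = [[1, 1, 0, 1, 1],
            [1, 1, 0, 1, 1]]
# The same squares pushed together: one box with ratio 2.
touching = [[1, 1, 1, 1],
            [1, 1, 1, 1]]

print(bounding_boxes(separate))
print(bounding_boxes(touching))
```

The merged box has the combined aspect ratio of both shapes, so a per-box ratio test has nothing correct to match it against.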

Is there a way to modify my approach to handle adjacent shapes separately? Also, is there a better way to perform step 3?

For example: (Es colored green, Ys colored blue)

[image]

Failed case: (unknown shape in red)

[image]

Source code:

# Note: this uses the legacy `cv` Python bindings that predate `cv2`.
import cv
import sys

# Load the two templates and record their width/height aspect ratios.
E = cv.LoadImage('e.png')
E_ratio = float(E.width)/E.height
Y = cv.LoadImage('y.png')
Y_ratio = float(Y.width)/Y.height
EPSILON = 0.1  # tolerance when comparing aspect ratios

# Find the outermost contours in the grayscale input image.
im = cv.LoadImage(sys.argv[1], cv.CV_LOAD_IMAGE_GRAYSCALE)
storage = cv.CreateMemStorage(0)
seq = cv.FindContours(im, storage, cv.CV_RETR_EXTERNAL,
        cv.CV_CHAIN_APPROX_SIMPLE)

# Compute the bounding box (x, y, width, height) of each contour.
regions = []
while seq:
    pts = [ pt for pt in seq ]
    x, y = zip(*pts)
    min_x, min_y = min(x), min(y)
    width, height = max(x) - min_x + 1, max(y) - min_y + 1
    regions.append((min_x, min_y, width, height))
    seq = seq.h_next()

# Classify each box by aspect ratio and draw it: E green, Y blue, unknown red.
rgb = cv.LoadImage(sys.argv[1], cv.CV_LOAD_IMAGE_COLOR)
for x,y,width,height in regions:
    pt1 = x,y
    pt2 = x+width,y+height
    if abs(float(width)/height - E_ratio) < EPSILON:
        color = (0,255,0,0)
    elif abs(float(width)/height - Y_ratio) < EPSILON:
        color = (255,0,0,0)
    else:
        color = (0,0,255,0)
    cv.Rectangle(rgb, pt1, pt2, color, 2)

cv.ShowImage('rgb', rgb)
cv.WaitKey(0)

e.png:

[image]

y.png:

[image]

good:

[image]

bad:

[image]

Before anybody asks, no, I'm not trying to break a captcha :) OCR per se isn't really relevant here: the actual shapes in my real project aren't characters. I'm just lazy, and characters are the easiest thing to draw (and still get detected by trivial methods).

As your shapes can vary in size and ratio, you should look at scale-invariant descriptors. A set of such descriptors would be well suited to your application.

Compute those descriptors for your templates, and then use some kind of simple classification to match regions against them. It should give pretty good results with simple shapes like the ones you show.

I have used Zernike and Hu moments in the past, the latter being the better known. You can find an example implementation here: http://www.lengrand.fr/2011/11/classification-hu-and-zernike-moments-matlab/
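
To make the idea concrete, here is a minimal sketch of the first two Hu invariants computed from a 0/1 grid in plain Python. It is not the code from the linked post; the helper names `hu_first_two` and `upscale` and the test shapes are assumptions made for illustration. In OpenCV the corresponding calls would be `cv2.moments` followed by `cv2.HuMoments`.

```python
def hu_first_two(grid):
    """First two Hu invariants of the foreground (non-zero) pixels."""
    pts = [(c, r) for r, row in enumerate(grid)
           for c, v in enumerate(row) if v]
    m00 = float(len(pts))
    cx = sum(x for x, _ in pts) / m00
    cy = sum(y for _, y in pts) / m00

    def eta(p, q):
        # Central moment, normalised by area so that scale drops out.
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pts)
        return mu / m00 ** (1 + (p + q) / 2.0)

    e20, e02, e11 = eta(2, 0), eta(0, 2), eta(1, 1)
    return e20 + e02, (e20 - e02) ** 2 + 4 * e11 ** 2


def upscale(grid, k):
    """Repeat every pixel k times in both directions."""
    return [[v for v in row for _ in range(k)]
            for row in grid for _ in range(k)]


L_SHAPE = [[1, 0, 0],
           [1, 0, 0],
           [1, 1, 1]]
BAR = [[1, 1, 1, 1, 1]]

print(hu_first_two(upscale(L_SHAPE, 4)))
print(hu_first_two(upscale(L_SHAPE, 8)))  # close to the line above
print(hu_first_two(upscale(BAR, 4)))      # clearly different
```

On a pixel grid the invariance is only approximate for the first value (the discretisation error shrinks as the shape gets larger), so a classifier should compare descriptors with a tolerance, for example by nearest neighbour against the template descriptors.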

Another thing: given your problem, you should look at OCR technologies (OCR stands for optical character recognition: http://en.wikipedia.org/wiki/Optical_character_recognition ;)).

Hope this helps a bit.

Julien

Have you tried chamfer matching, or contour matching (correspondence) using CCH as the descriptor?

Chamfer matching uses the distance transform of the target image and the template contour. It is not exactly scale invariant, but it is fast.
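
A minimal sketch of the idea in plain Python, under stated assumptions: the edge map is a 0/1 grid containing at least one edge pixel, the distance is city-block rather than Euclidean, and the template is tried at a single scale. The names `distance_transform` and `chamfer_best_offset` are hypothetical helpers; with OpenCV the distance map would normally come from `cv2.distanceTransform`.

```python
from collections import deque


def distance_transform(edges):
    """City-block distance from every cell to the nearest edge pixel."""
    rows, cols = len(edges), len(edges[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if edges[r][c]:
                dist[r][c] = 0
                queue.append((r, c))
    # Multi-source breadth-first search outwards from all edge pixels.
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def chamfer_best_offset(edges, template_points, t_w, t_h):
    """Slide the template over the image; return (score, x, y) of the best fit.

    The score is the mean distance from each template contour point to the
    nearest image edge, so 0.0 means a perfect overlap.
    """
    dist = distance_transform(edges)
    rows, cols = len(edges), len(edges[0])
    best = None
    for oy in range(rows - t_h + 1):
        for ox in range(cols - t_w + 1):
            total = sum(dist[oy + y][ox + x] for x, y in template_points)
            score = total / float(len(template_points))
            if best is None or score < best[0]:
                best = (score, ox, oy)
    return best


# Template contour: an L shape given as (x, y) points inside a 3x3 box.
template_points = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]

# 6x6 edge image containing that L with its top-left corner at x=2, y=1.
image = [[0] * 6 for _ in range(6)]
for x, y in template_points:
    image[y + 1][x + 2] = 1

print(chamfer_best_offset(image, template_points, 3, 3))
```

The distance transform is computed once per image, which is why the search is fast; handling unknown scale would mean repeating the search with the template points rescaled to several candidate sizes.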

The latter is rather slow, as the complexity is at least quadratic for the bipartite matching problem. On the other hand, this method is invariant to scale, rotation, and probably local distortion (for approximate matching, which IMHO is good for the bad example above).
