Extracting polygon given coordinates from an image using OpenCV

I have a set of points like the following:

     <data:polygon>
                            <data:point x="542" y="107"/>
                            <data:point x="562" y="102"/>
                            <data:point x="582" y="110"/>
                            <data:point x="598" y="142"/>
                            <data:point x="600" y="192"/>
                            <data:point x="601" y="225"/>
                            <data:point x="592" y="261"/>
                            <data:point x="572" y="263"/>
                            <data:point x="551" y="245"/>
                            <data:point x="526" y="220"/>
                            <data:point x="520" y="188"/>
                            <data:point x="518" y="152"/>
                            <data:point x="525" y="127"/>
                            <data:point x="542" y="107"/>
 </data:polygon>

I want to draw the polygon defined by these points in the image and then extract it. How can I do that using OpenCV with Python?

Use cv2.fillConvexPoly: you give it a 2D array of points and it fills the shape they define, which lets you build a mask that is white inside the polygon. A fair warning: this function assumes that the polygon defined by your points is convex (hence the name fillConvexPoly).

We can then convert this to a Boolean mask and use it to index into your image to extract the pixels you want. The code below produces an array called mask, a Boolean mask of the pixels you want to keep from the image. In addition, the array out will contain the extracted subimage defined by the polygon. Note that out is initialized to be completely black and that the only pixels copied over are those inside the polygon.
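To see what Boolean mask indexing does in isolation, here is a small NumPy-only sketch on a made-up 2 x 3 colour image (the values are arbitrary and chosen purely for illustration). A 2D Boolean mask applied to a 3-channel image selects whole pixels, with all channels kept together.

```python
import numpy as np

# A tiny 2 x 3 colour image with distinct values in every channel
img = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)

# Boolean mask with the same height and width as the image
mask = np.array([[True, False, False],
                 [False, True, True]])

# img[mask] has shape (3, 3): one row of channel values per selected pixel
selected = img[mask]

# Start from an all-black image and copy over only the masked pixels
out = np.zeros_like(img)
out[mask] = img[mask]
```

Pixels where mask is False keep the zeros from np.zeros_like, which is why the region outside the polygon ends up black.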

Assuming the actual image is called img, and assuming that your x and y points denote the horizontal and vertical coordinates in the image, you can do something like this:

import numpy as np
import cv2

pts = np.array([[542, 107], [562, 102], [582, 110], [598, 142], [600, 192], [601, 225], [592, 261], [572, 263], [551, 245], [526, 220], [520, 188], [518, 152], [525, 127], [542, 107]], dtype=np.int32)

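# Single-channel 8-bit mask with the same height and width as the image
mask = np.zeros((img.shape[0], img.shape[1]), dtype=np.uint8)

# Fill the polygon with 1s, then convert to a Boolean mask
cv2.fillConvexPoly(mask, pts, 1)
mask = mask.astype(bool)  # np.bool was removed in NumPy 1.24; use bool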

out = np.zeros_like(img)
out[mask] = img[mask]

out should be all black except for the region that was copied over. If you want to display this image, you can do something like:

cv2.imshow('Extracted Image', out)
cv2.waitKey(0)
cv2.destroyAllWindows()

This will display the image extracted from the polygon points and wait for you to press a key. When you are finished looking at the image, you can press any key as long as the display window has focus.

If you want to save this image to file, do something like this:

cv2.imwrite('output.png', out)

This will save the image to a file called output.png. I specify the PNG format because it's lossless.


As a simple test, let's define a white image that is 300 x 700, which is well beyond the largest coordinates in your polygon. Let's extract the region defined by that polygon and show what the output looks like.

img = 255*np.ones((300, 700, 3), dtype=np.uint8)

Using the above test image, we get this image:

[Image: the white polygon region on a black background]

Edit

If you would like to translate the extracted image so that it's in the middle, and then draw a square around its bounding box, a trick I can suggest is to use cv2.remap to translate the image. Once you're done, use cv2.rectangle to draw the square.

The way cv2.remap works is that for each pixel in the output, you specify the spatial coordinates of the pixel to sample in the source image. Because you're ultimately moving the output to the centre of the image, you need to apply an offset to every x and y location in the destination image to get the source pixel.

To figure out the right offsets to move the image, simply find the centroid of the polygon, translate the polygon so that the centroid is at the origin, and then translate it again so that it's at the centre of the image.

Using the variables we defined above, you can find the centroid by:

(meanx, meany) = pts.mean(axis=0)

Once you find the centroid, you take all points and subtract this centroid, then add the appropriate coordinates to move them to the centre of the image. The centre of the image can be found by:

(cenx, ceny) = (img.shape[1]/2, img.shape[0]/2)

It's also important to convert the coordinates to integers, because pixel coordinates are integers:

(meanx, meany, cenx, ceny) = np.floor([meanx, meany, cenx, ceny]).astype(np.int32)

Now, to figure out the offset, do what we talked about before:

(offsetx, offsety) = (-meanx + cenx, -meany + ceny)
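As a sanity check, here are the same steps worked through with concrete numbers. This sketch assumes the polygon from the question and the 300 x 700 test image; the variable moved is mine. Note that the "centroid" here is the mean of the vertices, as in the steps above.

```python
import numpy as np

# Polygon from the question, as (x, y) points
pts = np.array([[542, 107], [562, 102], [582, 110], [598, 142], [600, 192],
                [601, 225], [592, 261], [572, 263], [551, 245], [526, 220],
                [520, 188], [518, 152], [525, 127], [542, 107]], dtype=np.int32)

height, width = 300, 700  # size of the test image

# Mean of the vertices, floored to integer pixel coordinates: (559, 174)
meanx, meany = np.floor(pts.mean(axis=0)).astype(np.int32)

# Centre of the image: (350, 150)
cenx, ceny = width // 2, height // 2

# Offset that moves the centroid onto the image centre: (-209, -24)
offsetx, offsety = cenx - meanx, ceny - meany

# Adding the offset to every point moves the polygon to the centre
moved = pts + [offsetx, offsety]
```

Both offsets are negative because the polygon starts out to the right of and below the image centre, so it has to move left and up.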

Now, translate your image. You need to define a mapping for each pixel in the output image: for each point (x,y) in the destination image, you provide where to sample from in the source. The offset we calculated moves each source pixel to its destination location. Because we're doing the opposite here, finding for each destination pixel which source pixel to sample from, we must subtract the offset, not add it. Therefore, first define a regular grid of (x,y) points, then subtract the offset. Once you're done, translate the image:

(mx, my) = np.meshgrid(np.arange(img.shape[1]), np.arange(img.shape[0]))
ox = (mx - offsetx).astype(np.float32)
oy = (my - offsety).astype(np.float32)
out_translate = cv2.remap(out, ox, oy, cv2.INTER_LINEAR)

If we display out_translate for the test image above, this is what we get:

[Image: the polygon region moved to the centre of the image]


Cool! Now it's time to draw the rectangle on top of this image. All you have to do is figure out the top left and bottom right corners of the rectangle. This can be done by taking the top left and bottom right corners of the polygon's bounding box and adding the offset to move these points to the centre of the image:

topleft = pts.min(axis=0) + [offsetx, offsety]
bottomright = pts.max(axis=0) + [offsetx, offsety]
# Convert NumPy integers to plain Python ints before passing points to OpenCV
cv2.rectangle(out_translate, tuple(map(int, topleft)), tuple(map(int, bottomright)), color=(255,0,0))

If we show this image, we get:

[Image: the centred polygon region with a blue rectangle around it]


The above code draws a blue rectangle around the centred image. As such, the full code to go from start (extracting the pixel region) to end (translating and drawing a rectangle) is:

# Import relevant modules
import numpy as np
import cv2

# Define points
pts = np.array([[542, 107], [562, 102], [582, 110], [598, 142], [600, 192], [601, 225], [592, 261], [572, 263], [551, 245], [526, 220], [520, 188], [518, 152], [525, 127], [542, 107]], dtype=np.int32)

### Define image here
img = 255*np.ones((300, 700, 3), dtype=np.uint8)

# Initialize mask
mask = np.zeros((img.shape[0], img.shape[1]), dtype=np.uint8)

# Create mask that defines the polygon of points
cv2.fillConvexPoly(mask, pts, 1)
mask = mask.astype(bool)  # np.bool was removed in NumPy 1.24; use bool

# Create output image (untranslated)
out = np.zeros_like(img)
out[mask] = img[mask]

# Find centroid of polygon
(meanx, meany) = pts.mean(axis=0)

# Find centre of image
(cenx, ceny) = (img.shape[1]/2, img.shape[0]/2)

# Make integer coordinates for each of the above
(meanx, meany, cenx, ceny) = np.floor([meanx, meany, cenx, ceny]).astype(np.int32)

# Calculate final offset to translate source pixels to centre of image
(offsetx, offsety) = (-meanx + cenx, -meany + ceny)

# Define remapping coordinates
(mx, my) = np.meshgrid(np.arange(img.shape[1]), np.arange(img.shape[0]))
ox = (mx - offsetx).astype(np.float32)
oy = (my - offsety).astype(np.float32)

# Translate the image to centre
out_translate = cv2.remap(out, ox, oy, cv2.INTER_LINEAR)

# Determine top left and bottom right of translated image
topleft = pts.min(axis=0) + [offsetx, offsety]
bottomright = pts.max(axis=0) + [offsetx, offsety]

# Draw rectangle (points converted to plain Python ints for OpenCV)
cv2.rectangle(out_translate, tuple(map(int, topleft)), tuple(map(int, bottomright)), color=(255,0,0))

# Show image, wait for user input, then save the image
cv2.imshow('Output Image', out_translate)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.imwrite('output.png', out_translate)
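If your points live in XML like the snippet in the question, you can build pts from it instead of typing the coordinates by hand. The sketch below is an assumption on my part: it uses a regular expression rather than an XML parser because the snippet does not show a namespace declaration for the data: prefix, and a strict parser would reject the fragment without one. The xml string is a shortened copy of the question's data.

```python
import re
import numpy as np

# Shortened copy of the XML from the question
xml = '''
<data:polygon>
    <data:point x="542" y="107"/>
    <data:point x="562" y="102"/>
    <data:point x="582" y="110"/>
</data:polygon>
'''

# Pull out every x="..." y="..." pair in document order
pairs = re.findall(r'<data:point\s+x="(\d+)"\s+y="(\d+)"\s*/>', xml)

# Convert the matched strings to integers and build the (N, 2) point array
pts = np.array([(int(x), int(y)) for x, y in pairs], dtype=np.int32)
```

The pattern assumes non-negative integer coordinates with x listed before y, as in the question. If you have the complete document with its namespace declarations, a real XML parser is the more robust choice.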
