
Crop multiple bounding boxes from image with list of bounding boxes

Using Amazon's Rekognition, I have extracted the bounding boxes of interest from the JSON response using the following:

import json

import cv2


class Tmp:
    def __init__(self, image):
        self.shape = image.shape

    def bounding_box_convert(self, bounding_box):

        xmin = int(bounding_box['Left'] * self.shape[1])
        xmax = xmin + int(bounding_box['Width'] * self.shape[1])
        ymin = int(bounding_box['Top'] * self.shape[0])
        ymax = ymin + int(bounding_box['Height'] * self.shape[0])

        return (xmin,ymin,xmax,ymax)

    def polygon_convert(self, polygon):
        pts = []
        for p in polygon:
            x = int(p['X'] * self.shape[1])
            y = int(p['Y'] * self.shape[0])
            pts.append( [x,y] )

        return pts

def get_bounding_boxes(jsondata):
    objectnames = ('Helmet','Hardhat')
    bboxes = []
    a = jsondata
    if('Labels' in a):
        for label in a['Labels']:

            #-- skip over anything that isn't hardhat,helmet
            if(label['Name'] in objectnames):
                print('extracting {}'.format(label['Name']))


                lbl = "{}: {:0.1f}%".format(label['Name'], label['Confidence'])
                print(lbl)

                for instance in label['Instances']:
                    coords = tmp.bounding_box_convert(instance['BoundingBox'])
                    bboxes.append(coords)

    return bboxes

if __name__=='__main__':

    imagefile = 'image011.jpg'
    bgr_image = cv2.imread(imagefile)
    tmp = Tmp(bgr_image)

    jsonname = 'json_000'
    # Use a context manager so the file is closed after loading
    with open(jsonname, 'r') as fin:
        jsondata = json.load(fin)
    bb = get_bounding_boxes(jsondata)
    print(bb)

The output is a list of bounding boxes:

[(865, 731, 1077, 906), (1874, 646, 2117, 824)]

I can easily take one box from the list and save it as a new image using:

from PIL import Image
img = Image.open("image011.jpg")
area = (865, 731, 1077, 906)
cropped_img = img.crop(area)
cropped_img.save("cropped.jpg")

However, I haven't found a good way to crop and save multiple bounding boxes from the image using the 'bb' list output.

I did find a solution that extracts the information from a csv here: "Most efficient/quickest way to crop multiple bounding boxes in 1 image, over thousands of images?".

But I believe there is a more efficient way than saving the bounding box data to a csv and reading it back in.

I'm not very strong at writing my own functions, so all suggestions are much appreciated!

Assuming your bounding box coordinates are in the form x,y,w,h, you can crop with ROI = image[y:y+h, x:x+w]. With this input image:

[Image: input image]

I used the script from "how to get ROI Bounding Box Coordinates without Guess & Check" to obtain the x,y,w,h bounding box coordinates of these ROIs:

[Image: bounding boxes of the ROIs]

We simply iterate through the bounding box list and crop each box using NumPy slicing. The extracted ROIs:

[Image: extracted ROIs]

Here's a minimal example:

import cv2
import numpy as np 

image = cv2.imread('1.png')
bounding_boxes = [(17, 24, 47, 47),
                  (74, 28, 47, 50),
                  (125, 15, 51, 61),
                  (184, 18, 53, 53),
                  (247, 25, 44, 46),
                  (296, 6, 65, 66)
]

for num, (x, y, w, h) in enumerate(bounding_boxes):
    # NumPy images are indexed rows first, so y comes before x
    ROI = image[y:y+h, x:x+w]
    cv2.imwrite('ROI_{}.png'.format(num), ROI)
    cv2.imshow('ROI', ROI)
    cv2.waitKey()
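
Note that the 'bb' list in the question holds (xmin, ymin, xmax, ymax) tuples rather than x,y,w,h, so the slice becomes image[ymin:ymax, xmin:xmax]. Below is a minimal sketch of that variant. The helper name crop_boxes, the clamping to the image bounds, and the blank stand-in image (with an assumed size) are illustrative choices of mine, not part of the original code:

```python
import numpy as np


def crop_boxes(image, boxes):
    """Return one cropped array per (xmin, ymin, xmax, ymax) box."""
    height, width = image.shape[:2]
    crops = []
    for xmin, ymin, xmax, ymax in boxes:
        # Clamp to the image bounds so a box that spills over an edge
        # still produces a valid slice.
        x0, y0 = max(0, xmin), max(0, ymin)
        x1, y1 = min(width, xmax), min(height, ymax)
        # Skip boxes that end up empty after clamping.
        if x1 > x0 and y1 > y0:
            crops.append(image[y0:y1, x0:x1])
    return crops


# Stand-in for cv2.imread('image011.jpg'); the size here is an assumption.
image = np.zeros((1200, 2200, 3), dtype=np.uint8)
bb = [(865, 731, 1077, 906), (1874, 646, 2117, 824)]

crops = crop_boxes(image, bb)
print([c.shape for c in crops])
```

To save the results, each crop can be written out in a loop, for example with cv2.imwrite('crop_{}.jpg'.format(i), crop) inside for i, crop in enumerate(crops). If you prefer to stay with PIL, img.crop(box) already accepts the same (xmin, ymin, xmax, ymax) order, so looping over 'bb' and calling img.crop(box).save(...) with a distinct filename per box should work as well.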
