简体   繁体   中英

How to get an object bounding box given pixel label in python?

Say I have a scene parsing map for an image, each pixel in this scene parsing map indicates which object this pixel belongs to. Now I want to get bounding box of each object, how can I implement this in python? For a detail example, say I have a scene parsing map like this:

0 0 0 0 0 0 0
0 1 1 0 0 0 0
1 1 1 1 0 0 0
0 0 1 1 1 0 0
0 0 1 1 1 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0

So the bounding box is:

0 0 0 0 0 0 0
1 1 1 1 1 0 0
1 0 0 0 1 0 0
1 0 0 0 1 0 0
1 1 1 1 1 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0

Actually, in my task, just know the width and height of this object is enough.

A basic idea is to search four edges in the scene parsing map, from top, bottom, left and right direction. But there might be a lot of small objects in the image, this way is not time efficient.

A second way is to calculate the coordinates of all non-zero elements and find the max/min x/y. Then calculate weight and height using these x and y.

Is there any other more efficient way to do this? Thx.

If you are processing images, you can use scipy's ndimage library.

If there is only one object in the image, you can get the measurements with scipy.ndimage.measurements.find_objects ( http://docs.scipy.org/doc/scipy-0.16.1/reference/generated/scipy.ndimage.measurements.find_objects.html ):

import numpy as np
from scipy import ndimage
a = np.array([[0, 0, 0, 0, 0, 0, 0],
              [0, 1, 1, 0, 0, 0, 0],
              [1, 1, 1, 1, 0, 0, 0],
              [0, 0, 1, 1, 1, 0, 0],
              [0, 0, 1, 1, 1, 0, 0],
              [0, 0, 0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0, 0, 0]])

# Find the location of all objects
objs = ndimage.find_objects(a)

# Get the height and width
height = int(objs[0][0].stop - objs[0][0].start)
width = int(objs[0][1].stop - objs[0][1].start)

If there are many objects in the image, you first have to label each object and then get the measurements:

import numpy as np
from scipy import ndimage
a = np.array([[0, 0, 0, 0, 0, 0, 0],
              [0, 1, 1, 0, 0, 0, 0],
              [1, 1, 1, 1, 0, 0, 0],
              [0, 0, 1, 1, 1, 0, 0],
              [0, 0, 1, 1, 1, 0, 0],
              [0, 0, 0, 0, 0, 0, 0],
              [0, 0, 1, 1, 1, 0, 0]])  # Second object here
# Label objects
labeled_image, num_features = ndimage.label(a)
# Find the location of all objects
objs = ndimage.find_objects(labeled_image)
# Get the height and width
measurements = []
for ob in objs:
    measurements.append((int(ob[0].stop - ob[0].start), int(ob[1].stop - ob[1].start)))

If you check ndimage.measurements, you can get more measurements: center of mass, area...

using numpy:

import numpy as np

ind = np.nonzero(arr.any(axis=0))[0] # indices of non empty columns 
width = ind[-1] - ind[0] + 1
ind = np.nonzero(arr.any(axis=1))[0] # indices of non empty rows
height = ind[-1] - ind[0] + 1

a bit more explanation:

arr.any(axis=0) gives a boolean array telling you if the columns are empty ( False ) or not ( True ). np.nonzero(arr.any(axis=0))[0] then extract the non zero (ie True ) indices from that array. ind[0] is the first element of that array, hence the left most column non empty column and ind[-1] is the last element, hence the right most non empty column. The difference then gives the width, give or take 1 depending on whether you include the borders or not. Similar stuff for the height but on the other axis.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM