
Detect and remove zero padding within an image array?

Is there a way to detect and remove zero padding within an image array? In a way my question is very similar to this except the image has already been rotated and I do not know the angle.

I am basically cropping a box out of a larger image which may have zero padding around the edges (due to translations or rotations). It's possible that the crop contains some of this padding. In such cases, I want to clip the box where the padding edge starts. The images are in CHW format (this can easily be changed to HWC).

The padding in this case will be 0s across all channels. However, due to rotations, the 0s may not always form completely horizontal or vertical strips in the array. Is there a way to detect whether there are 0s running all the way to an edge of the array, and at what location the padding starts?

Example 1, where arr is an image with 3 channels and a height and width of 4 (shape (3, 4, 4)), and the crop contains vertical padding on the rightmost edge:

array([[[1., 1., 1., 0.],
        [1., 1., 1., 0.],
        [1., 1., 1., 0.],
        [1., 1., 1., 0.]],

       [[1., 1., 1., 0.],
        [1., 1., 1., 0.],
        [1., 1., 1., 0.],
        [1., 1., 1., 0.]],

       [[1., 1., 1., 0.],
        [1., 1., 1., 0.],
        [1., 1., 1., 0.],
        [1., 1., 1., 0.]]])

In this example, I would slice the array as such to get rid of the zero padding: arr[:, :, :-1]

Example 2 where we have some padding on the top right corner:

array([[[1., 1., 0., 0.],
        [1., 1., 1., 0.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 0., 0.],
        [1., 1., 1., 0.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 0., 0.],
        [1., 1., 1., 0.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]])

In this example, I would clip the image to remove any padding by returning arr[:, 1:, :-1].

I want to do this in Tensorflow so tensor operations would be great but I am trying to figure out any algorithm, for example using numpy, that can achieve this result.

If you don't mind throwing away some of the image and are okay with a liberal crop, as long as it doesn't contain padding, you can get quite an efficient solution:

import numpy as np

pad_value = 0.0
arr = <test_image>

# True wherever no channel equals the padding value
arr_masked = np.all(arr != pad_value, axis=0)
y_low = np.max(np.argmax(arr_masked, axis=0))
x_low = np.max(np.argmax(arr_masked, axis=1))
y_high = np.min(arr_masked.shape[0] - np.argmax(arr_masked[::-1, :], axis=0))
x_high = np.min(arr_masked.shape[1] - np.argmax(arr_masked[:, ::-1], axis=1))
cropped = arr[:, y_low:y_high, x_low:x_high]
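For instance, running this on the Example 2 array from the question gives a conservative (3, 2, 2) crop, smaller than the maximal padding-free crop arr[:, 1:, :-1] but guaranteed to contain no padding:

```python
import numpy as np

# Example 2 array from the question: padding in the top-right corner
arr = np.ones((3, 4, 4))
arr[:, 0, 2:] = 0.0
arr[:, 1, 3] = 0.0

pad_value = 0.0
arr_masked = np.all(arr != pad_value, axis=0)  # (H, W) mask of valid pixels
y_low = np.max(np.argmax(arr_masked, axis=0))
x_low = np.max(np.argmax(arr_masked, axis=1))
y_high = np.min(arr_masked.shape[0] - np.argmax(arr_masked[::-1, :], axis=0))
x_high = np.min(arr_masked.shape[1] - np.argmax(arr_masked[:, ::-1], axis=1))
cropped = arr[:, y_low:y_high, x_low:x_high]
print(cropped.shape)  # → (3, 2, 2)
```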

If it has to be the biggest possible crop, then more work is needed. Essentially, we have to check, for every contiguous sub-image, whether it contains padding, and then compare them all by size.

Main idea: Assume that the top-left corner of the padding-free sub-image is at (x1, y1) and the bottom-right corner is at (x2, y2). Then we can encode the number of pixels in each sub-image as a rank-4 tensor with dimensions [y1, x1, y2, x2]. We set the number of pixels to 0 if the combination is not a valid sub-image, i.e., if it has a negative width or height, or if it contains a padded pixel.

pad_value = 0.0
arr = <test_image>

# mask of valid pixels: True where no channel equals the padding value
# (must be computed before it is used to build the index ranges below)
arr_masked = np.all(arr != pad_value, axis=0)

# indices for sub-image tensor
y = np.arange(arr_masked.shape[0])
x = np.arange(arr_masked.shape[1])
y1 = y[:, None, None, None]
y2 = y[None, None, :, None]
x1 = x[None, :, None, None]
x2 = x[None, None, None, :]

# coordinates of padded pixels
pad_north = np.argmax(arr_masked, axis=0)
pad_west = np.argmax(arr_masked, axis=1)
pad_south = arr_masked.shape[0] - np.argmax(arr_masked[::-1, :], axis=0)
pad_east = arr_masked.shape[1] - np.argmax(arr_masked[:, ::-1], axis=1)

is_padded = np.zeros_like(arr_masked)
is_padded[y[:, None] < pad_north[None, :]] = True
is_padded[y[:, None] >= pad_south[None, :]] = True
is_padded[x[None, :] < pad_west[:, None]] = True
is_padded[x[None, :] >= pad_east[:, None]] = True

y_padded, x_padded = np.where(is_padded)
y_padded = y_padded[None, None, None, None, :]
x_padded = x_padded[None, None, None, None, :]

# size of the sub-image
height = np.clip(y2 - y1 + 1, 0, None)
width = np.clip(x2 - x1 + 1, 0, None)
img_size = width * height

# sub-image contains at least one padded pixel
y_inside = np.logical_and(y1[..., None] <= y_padded, y_padded <= y2[..., None])
x_inside = np.logical_and(x1[..., None] <= x_padded, x_padded <= x2[..., None])
contains_border = np.any(np.logical_and(y_inside, x_inside), axis=-1)

# ignore sub-images containing padded pixels
img_size[contains_border] = 0

# find all largest sub-images; the +1 turns y2/x2 into exclusive slice bounds
tmp = np.where(img_size == np.max(img_size))
rectangles = (tmp[0], tmp[1], tmp[2] + 1, tmp[3] + 1)

Now rectangles contains the corner coordinates of all sub-images that have the largest number of pixels without containing any padded pixels. The computation is already quite vectorized, so you should be able to migrate it from numpy to tensorflow.
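As a sanity check, the whole procedure can be wrapped in a function (the name largest_unpadded_crop is mine, not from the answer) and run on the Example 2 array from the question; it recovers the expected maximal crop arr[:, 1:, :-1]:

```python
import numpy as np

def largest_unpadded_crop(arr, pad_value=0.0):
    """Return (y1, x1, y2, x2) bounds of the largest padding-free sub-image
    of a CHW array, using the brute-force rank-4 search described above."""
    arr_masked = np.all(arr != pad_value, axis=0)

    y = np.arange(arr_masked.shape[0])
    x = np.arange(arr_masked.shape[1])
    y1 = y[:, None, None, None]
    y2 = y[None, None, :, None]
    x1 = x[None, :, None, None]
    x2 = x[None, None, None, :]

    # first/last valid pixel along each column and row
    pad_north = np.argmax(arr_masked, axis=0)
    pad_west = np.argmax(arr_masked, axis=1)
    pad_south = arr_masked.shape[0] - np.argmax(arr_masked[::-1, :], axis=0)
    pad_east = arr_masked.shape[1] - np.argmax(arr_masked[:, ::-1], axis=1)

    is_padded = np.zeros_like(arr_masked)
    is_padded[y[:, None] < pad_north[None, :]] = True
    is_padded[y[:, None] >= pad_south[None, :]] = True
    is_padded[x[None, :] < pad_west[:, None]] = True
    is_padded[x[None, :] >= pad_east[:, None]] = True
    y_pad, x_pad = np.where(is_padded)

    # pixel count of every candidate rectangle [y1, x1, y2, x2]
    height = np.clip(y2 - y1 + 1, 0, None)
    width = np.clip(x2 - x1 + 1, 0, None)
    img_size = width * height

    # zero out rectangles that contain at least one padded pixel
    y_inside = (y1[..., None] <= y_pad) & (y_pad <= y2[..., None])
    x_inside = (x1[..., None] <= x_pad) & (x_pad <= x2[..., None])
    img_size[np.any(y_inside & x_inside, axis=-1)] = 0

    best = np.argwhere(img_size == img_size.max())[0]
    return best[0], best[1], best[2] + 1, best[3] + 1  # exclusive bounds

# Example 2 array: padding in the top-right corner
arr = np.ones((3, 4, 4))
arr[:, 0, 2:] = 0.0
arr[:, 1, 3] = 0.0

y1, x1, y2, x2 = largest_unpadded_crop(arr)
print(y1, x1, y2, x2)  # → 1 0 4 3, i.e. arr[:, 1:4, 0:3] == arr[:, 1:, :-1]
```

Note the memory cost: the rectangle tensor has H*W*H*W entries (times the number of padded pixels for the containment check), so this is only practical for small crops.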

Please try this solution:

def remove_zero_pad(image):
    # assumes an (H, W) or (H, W, C) array whose background/padding is zero
    dummy = np.argwhere(image != 0)
    max_y = dummy[:, 0].max()
    min_y = dummy[:, 0].min()
    min_x = dummy[:, 1].min()
    max_x = dummy[:, 1].max()
    # +1 so the bottommost row and rightmost column are kept (slice ends are exclusive)
    crop_image = image[min_y:max_y + 1, min_x:max_x + 1]

    return crop_image
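A minimal self-contained check on a hypothetical 4x4 grayscale image, with the upper slice bounds made inclusive via +1 (NumPy slices exclude the end index):

```python
import numpy as np

def remove_zero_pad(image):
    # assumes an (H, W) or (H, W, C) array whose background/padding is zero
    dummy = np.argwhere(image != 0)
    min_y, max_y = dummy[:, 0].min(), dummy[:, 0].max()
    min_x, max_x = dummy[:, 1].min(), dummy[:, 1].max()
    return image[min_y:max_y + 1, min_x:max_x + 1]

# zero padding along the top row and the rightmost column
img = np.ones((4, 4))
img[0, :] = 0.0
img[:, 3] = 0.0

print(remove_zero_pad(img).shape)  # → (3, 3)
```

Note that this bounding-box approach keeps any interior zeros caused by rotation (e.g. a diagonal padding edge); it only trims rows and columns that are entirely outside the nonzero region.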
