Extract rectangular boxes with padding

Question

I'm trying to extract values from a 2D tensor inside multiple rectangular regions. I want to crop rectangular regions while setting all values outside the box to zero.

For example, from the 9 x 9 image I want to get two separate images with the values inside the two red rectangular boxes, while setting the rest of the values to zero. Is there a convenient way to do this with TensorFlow slicing?

One way I thought of approaching this is to define a mask array that is 1 inside the box and 0 outside, and multiply it with the input array. But this requires looping over the boxes, each time changing which values of the mask are set to 0. Is there a faster and more efficient way to do this without using for loops? Is there an equivalent of a crop-and-replace function in TensorFlow? Here's the code I have with the for loop. I'd appreciate any input on this. Thanks.

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import matplotlib.patches as patches

tf.reset_default_graph()

size = 9 # size of input image
num_boxes = 2 # number of rectangular boxes


def get_cutout(X, bboxs):
    """Returns copies of X with values only inside bboxs"""
    out = []
    for i in range(num_boxes):
        bbox = bboxs[i] # get rectangular box coordinates
        Y = tf.Variable(np.zeros((size, size)), dtype=tf.float32) # define temporary mask
        # set values of mask inside box to 1
        t = [Y[bbox[0]:bbox[2], bbox[1]:bbox[3]].assign(
            tf.ones((bbox[2]-bbox[0], bbox[3]-bbox[1])))]
        with tf.control_dependencies(t):
            mask = tf.identity(Y) 
        out.append(X * mask) # get values inside rectangular box
    return out, X

#define a 9x9 input image X and convert to tensor
in_x = np.eye(size)
in_x[0:3]=np.random.rand(3,9)
X = tf.constant(in_x , dtype=tf.float32)

bboxs = tf.placeholder(tf.int32, [None, 4]) # placeholder for rectangular box

X_outs = get_cutout(X, bboxs)

# coordinates of each box, as used by the slicing above: (row start, column start, row end, column end), ends exclusive
in_bbox = [[1,3,3,6], [4,3,7,8]] 
feed_dict = {bboxs: in_bbox}

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    x_out= sess.run(X_outs, feed_dict=feed_dict)

# plot results
vmin = np.min(x_out[1])  # x_out is (list of cutouts, X), so X is at index 1
vmax = np.max(x_out[1])
fig, ax = plt.subplots(nrows=1, ncols=1+len(in_bbox),figsize=(10,2))
im = ax[0].imshow(x_out[1], vmin=vmin, vmax=vmax, origin='lower')
plt.colorbar(im, ax=ax[0])
ax[0].set_title("input X")
for i, bbox in enumerate(in_bbox):
    bottom_left = (bbox[1]-0.5, bbox[0]-0.5)
    width = bbox[3]-bbox[1]
    height = bbox[2]- bbox[0]
    rect = patches.Rectangle(bottom_left, width, height,
                             linewidth=1,edgecolor='r',facecolor='none')
    ax[0].add_patch(rect)
    ax[i+1].set_title("extract values in box {}".format(i+1))
    im = ax[i + 1].imshow(x_out[0][i], vmin=vmin, vmax=vmax, origin='lower')
    plt.colorbar(im,ax=ax[i+1])

Answer 1

The mask can be created using tf.pad.

 crop = tf.ones([3, 3])
 # "before_axis_x": how much padding is added before the cropping zone along axis x
 # "after_axis_x": how much padding is added after the cropping zone along axis x
 mask = tf.pad(crop, [[before_axis_0, after_axis_0], [before_axis_1, after_axis_1]])

 tf.boolean_mask(image, tf.cast(mask, tf.bool)) # creates the extracted image
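
As a worked instance of the padding amounts, here is a minimal sketch for one box from the question, [1, 3, 3, 6] in a 9 x 9 image. It uses NumPy's np.pad rather than tf.pad so that it runs without a session; the assumption is that tf.pad accepts the same [[before, after], ...] layout, as the snippet above suggests. The variable names are illustrative only.

```python
import numpy as np

height, width = 9, 9          # image size used in the question
y1, x1, y2, x2 = 1, 3, 3, 6   # one box, end coordinates exclusive

# Block of ones with the size of the box (2 rows x 3 columns here).
crop = np.ones((y2 - y1, x2 - x1))

# Rows: y1 zeros before, height - y2 zeros after.
# Columns: x1 zeros before, width - x2 zeros after.
paddings = [[y1, height - y2], [x1, width - x2]]

# Zero-pad the block back up to the full image size.
mask = np.pad(crop, paddings, mode="constant")
```

The padded mask has the same shape as the image, so it can be multiplied with the image element-wise to zero out everything outside the box.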

To get behavior similar to tf.image.crop_and_resize, here is a function that takes an array of boxes and returns an array of extracted images with padding.

def extract_with_padding(image, boxes):
  """
   boxes: tensor of shape [num_boxes, 4]. 
          boxes are the coordinates of the extracted part
          box is an array [y1, x1, y2, x2] 
          where [y1, x1] (respectively [y2, x2]) are the coordinates 
          of the top left (respectively bottom right ) part of the image
   image: tensor containing the initial image
  """
  extracted = []
  shape = tf.shape(image)
  for b in boxes:
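    # the crop size is hard-coded to 3 x 3, so every box is assumed to be 3 x 3
    crop = tf.ones([3, 3])

    mask = tf.pad(crop, [[b[0], shape[0] - b[2]], [b[1] , shape[1] - b[3]]])
    # tf.boolean_mask expects a boolean mask, so cast the 0/1 float mask
    extracted.append(tf.boolean_mask(image, tf.cast(mask, tf.bool)))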

  return extracted

Answer 2

Thanks for that really nice function, @edkevekeh. I had to modify it slightly to get it to do what I want. First, I couldn't iterate over boxes, which is a Tensor object. Second, the crop size is determined by the box and is not always 3x3. Also, tf.boolean_mask returns only the crop, but I want to keep the crop and replace everything outside it with 0, so I replaced tf.boolean_mask with multiplication.

For my use case num_boxes can be large, so I wanted to know if there was a more efficient way than a for loop; I guess not. Here is my modified version of @edkevekeh's solution, in case anyone else needs it.

def extract_with_padding(image, boxes):
    """
    boxes: tensor of shape [num_boxes, 4]. 
          boxes are the coordinates of the extracted part
          box is an array [y1, x1, y2, x2] 
          where [y1, x1] (respectively [y2, x2]) are the coordinates 
          of the top left (respectively bottom right ) part of the image
    image: tensor containing the initial image
    """
    extracted = []
    shape = tf.shape(image)
    for i in range(boxes.shape[0]):
        b = boxes[i]
        crop = tf.ones([b[2] - b[0], b[3] - b[1]])
        mask = tf.pad(crop, [[b[0], shape[0] - b[2]], [b[1] , shape[1] - b[3]]])
        extracted.append(image*mask)
    return extracted
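
For readers who still want to avoid the loop, one possible approach is to build all masks at once by comparing row and column indices against the box coordinates with broadcasting. The sketch below shows the idea in NumPy; box_masks is a hypothetical helper, not something from TensorFlow or from the answers above. It is an assumption, untested here, that the same comparisons carry over to TensorFlow using tf.range, tf.logical_and and tf.cast.

```python
import numpy as np

def box_masks(boxes, height, width):
    """Return a [num_boxes, height, width] array of 0/1 masks.

    boxes: integer array-like of shape [num_boxes, 4]; each row is
           [y1, x1, y2, x2] with y2 and x2 exclusive.
    """
    boxes = np.asarray(boxes)
    rows = np.arange(height)[None, :, None]   # shape [1, H, 1]
    cols = np.arange(width)[None, None, :]    # shape [1, 1, W]
    # Each coordinate gets shape [N, 1, 1] so it broadcasts against rows/cols.
    y1, x1, y2, x2 = (boxes[:, k][:, None, None] for k in range(4))
    inside = (rows >= y1) & (rows < y2) & (cols >= x1) & (cols < x2)
    return inside.astype(np.float32)

image = np.arange(81, dtype=np.float32).reshape(9, 9)
boxes = [[1, 3, 3, 6], [4, 3, 7, 8]]

masks = box_masks(boxes, 9, 9)          # shape [2, 9, 9]
cutouts = image[None, :, :] * masks     # one zero-padded cutout per box
```

The trade-off is memory: this materializes a full-size mask for every box at once, which may matter when num_boxes or the image is large.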

2 answers

Answer 1
0 2019-03-14 01:38:08

Answer 2
0 2019-03-15 20:49:12