How do I apply a mask to images in a tf.data pipeline?

Question

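My TensorFlow input pipeline reads images for a demosaicing network. Given an image tensor of dimension (H, W, 3), I want to generate a "mosaiced" image of dimension (H, W, 3) in which 2/3 of the original information is zeroed out. In other words, I want to apply a mask to the image data array. For example, a 2x2 image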

[[ [R_11, G_11, B_11], [R_12, G_12, B_12] ],
 [ [R_21, G_21, B_21], [R_22, G_22, B_22] ]]

should become

[[ [R_11, 0, 0], [0, G_12, 0] ],
 [ [0, G_21, 0], [0, 0, B_22] ]]

via the element-wise mask

[[ [1, 0, 0], [0, 1, 0] ],
 [ [0, 1, 0], [0, 0, 1] ]].

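Of course, I want to handle not only 2x2 images but larger ones as well. The mask I want to apply is just the 2x2 mask tiled to cover the image dimensions. This is not hard to do in numpy, but when I try to operate on the tensors in the tf.data pipeline, I get the error: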

NotImplementedError: Cannot convert a symbolic Tensor (args_0:0) to a numpy array.

from the following code:

def bayerize_3d(t_img):
    """
    Inputs:
        img: tensor of dimension [H, W, 3], where H and W are divisible by 2, color channels are RGB
    Outputs:
        bayer: bayerized version of img, a tensor of dimension [H, W, 3] (1/3 of the former information)
    """    
    np_mask = np.zeros_like(t_img)
    np_mask[0::2, 0::2, 0] = 1
    np_mask[0::2, 1::2, 1] = 1
    np_mask[1::2, 0::2, 1] = 1
    np_mask[1::2, 1::2, 2] = 1
    t_mask = tf.convert_to_tensor(np_mask)

    t_bayer = tf.math.multiply(t_img, t_mask)

    return t_bayer

For context, I am using TensorFlow 2, and bayerize_3d is called from the following:

def decode_img(img):
    """Taken from https://www.tensorflow.org/tutorials/load_data/images"""
    # convert the compressed string to a 3D uint8 tensor
    img = tf.image.decode_png(img, channels=3)
    # Use `convert_image_dtype` to convert to floats in the [0,1] range.
    img = tf.image.convert_image_dtype(img, tf.float32)
    return img

def load_vimeo_data(data_dir, dir_list, bayer_option='3d'):
    """Loads Vimeo-90K training / test data as tensorflow dataset, returns tuple of x and y datasets
    data_dir - path to vimeo90k training / test dataset directory
    dir_list - path to file containing relative directory paths to use for this dataset
    bayerize - either '3d' or '4d' - should """
    # ugly python 3.5 workaround
    target_dir = str( pathlib.Path(data_dir) / 'sequences' )
    #directories = list(map(lambda x: str(x), target_dir.glob('*/*')))

    directories = []
    with open(dir_list, 'rt') as f:
        for line in f:
            directories.append(str(pathlib.Path(target_dir) / line.strip()))

    im1str = tf.constant('/im1.png')
    im2str = tf.constant('/im2.png')
    im3str = tf.constant('/im3.png')

    bayerize = None
    if bayer_option == '3d':
        bayerize = bayerize_3d
    elif bayer_option == '4d':
        bayerize = bayerize_4d

    def process_dir(dir_path):
        im1 = decode_img(tf.io.read_file(tf.strings.join([dir_path, im1str])))
        im2 = decode_img(tf.io.read_file(tf.strings.join([dir_path, im2str])))
        im3 = decode_img(tf.io.read_file(tf.strings.join([dir_path, im3str])))
        return im1, im2, im3

    def bayerize_stack(im1, im2, im3):
        b_im1 = bayerize(im1)
        b_im2 = bayerize(im2)
        b_im3 = bayerize(im3)
        return tf.stack([b_im1, b_im2, b_im3], axis=0)

    def extract_middle(im1, im2, im3):
        im2 = tf.stack([im2], axis=0)
        return im2

    def process_stack(im1, im2, im3):
        return bayerize_stack(im1, im2, im3), extract_middle(im1, im2, im3)

    dirs = tf.data.Dataset.from_tensor_slices(directories)
    dirs_dataset = dirs.map(process_dir)
    # dataset is bayerized stack "labeled" with true middle frame
    dataset = dirs_dataset.map(process_stack)
    return dataset

Surely there is a proper way to do this?

Answer 1

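The problem seems to come from trying to create a NumPy array from a tensor. Inside Dataset.map the function is traced as a graph, so t_img is a symbolic tensor with no concrete values (and possibly no fully defined shape) for np.zeros_like to read.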
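To make the cause concrete, here is a minimal sketch showing that a function passed to Dataset.map runs in graph mode rather than eagerly. The function name `probe`, the `observed` dictionary and the dummy images are made up for illustration and are not part of the question's code.

```python
import tensorflow as tf

observed = {}

def probe(img):
    # Dataset.map traces this function once into a graph, so `img` is a
    # symbolic placeholder here, not a tensor that holds pixel values.
    observed["eager"] = tf.executing_eagerly()
    observed["static_shape"] = img.shape.as_list()
    return img

# Two dummy 4x4 RGB images (illustrative data only).
images = tf.zeros([2, 4, 4, 3], dtype=tf.float32)
dataset = tf.data.Dataset.from_tensor_slices(images).map(probe)

print(observed)
```

In this toy case the static shape happens to be known because the data comes from an in-memory tensor. With tf.image.decode_png, as in the question, the height and width are typically unknown at trace time.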

My suggestion is to pass the image shape into your dataset pipeline explicitly and build the NumPy array from those known dimensions.
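A minimal sketch of that idea, assuming every image has the same, known size. The constants `H` and `W` and the names `make_bayer_mask`, `BAYER_MASK` and `bayerize_fixed` are hypothetical. The mask is built once in NumPy outside the pipeline, so nothing inside the mapped function needs to convert a symbolic tensor.

```python
import numpy as np
import tensorflow as tf

H, W = 4, 6  # assumed fixed image size; both must be even

def make_bayer_mask(h, w):
    """Build an (h, w, 3) float32 mask that keeps one channel per pixel."""
    mask = np.zeros((h, w, 3), dtype=np.float32)
    mask[0::2, 0::2, 0] = 1  # red on even rows, even columns
    mask[0::2, 1::2, 1] = 1  # green on even rows, odd columns
    mask[1::2, 0::2, 1] = 1  # green on odd rows, even columns
    mask[1::2, 1::2, 2] = 1  # blue on odd rows, odd columns
    return mask

# Created eagerly, once; the mapped function only captures this constant.
BAYER_MASK = tf.constant(make_bayer_mask(H, W))

def bayerize_fixed(t_img):
    # Element-wise multiply; t_img must have shape [H, W, 3] and dtype float32.
    return t_img * BAYER_MASK

# Dummy all-ones images stand in for the decoded PNGs.
images = tf.ones([2, H, W, 3], dtype=tf.float32)
dataset = tf.data.Dataset.from_tensor_slices(images).map(bayerize_fixed)
first = next(iter(dataset)).numpy()
```

Because the mask is computed up front, this only works when all images in the dataset share the size the mask was built for.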

Alternatively, you can set the shape of the input tensor, e.g.

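# h and w must match the actual size of the images
h = 500
w = 500
# set_shape takes a single shape argument, so pass the dimensions as a list
t_img.set_shape([h, w, 3])
# the static shape is now fully defined and can size the NumPy mask;
# float32 matches the dtype produced by decode_img
np_mask = np.zeros(t_img.shape.as_list(), dtype=np.float32)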
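If the image size is not known ahead of time, another option (not part of the suggestion above, just a sketch) is to skip NumPy and build the mask with TensorFlow ops, tiling the 2x2 pattern from the question with tf.tile according to the dynamic shape from tf.shape. The names `BAYER_TILE`, `bayerize_dynamic` and `decode_and_bayerize` are hypothetical, and the PNGs are generated in memory only to keep the example self-contained.

```python
import tensorflow as tf

# The 2x2x3 pattern from the question: R, G on the first row; G, B on the second.
BAYER_TILE = tf.constant(
    [[[1, 0, 0], [0, 1, 0]],
     [[0, 1, 0], [0, 0, 1]]], dtype=tf.float32)

def bayerize_dynamic(t_img):
    """Mask an [H, W, 3] float32 image; assumes H and W are even."""
    shape = tf.shape(t_img)  # dynamic shape, usable even if the static shape is unknown
    multiples = tf.stack([shape[0] // 2, shape[1] // 2, 1])
    mask = tf.tile(BAYER_TILE, multiples)  # resulting shape: [H, W, 3]
    return t_img * mask

def decode_and_bayerize(png_bytes):
    # Same decoding steps as decode_img in the question.
    img = tf.image.decode_png(png_bytes, channels=3)
    img = tf.image.convert_image_dtype(img, tf.float32)
    return bayerize_dynamic(img)

# Two all-white PNGs of different sizes, encoded in memory (illustrative data).
white = tf.constant(255, dtype=tf.uint8)
pngs = tf.stack([
    tf.io.encode_png(tf.fill([2, 4, 3], white)),
    tf.io.encode_png(tf.fill([4, 2, 3], white)),
])
dataset = tf.data.Dataset.from_tensor_slices(pngs).map(decode_and_bayerize)
results = [img.numpy() for img in dataset]
```

The trade-off is that the mask is rebuilt for every image, which is cheap for a tile operation but avoidable when all images share one size.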
