How do I apply a mask to an image in a tf.data pipeline?
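My TensorFlow input pipeline reads images for use in a demosaicing network. Given an image tensor of shape (H, W, 3), I want to produce a "mosaiced" image of shape (H, W, 3) in which 2/3 of the original information is zeroed out. In other words, I want to apply a mask to the image data array. For example, a 2x2 image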
[[ [R_11, G_11, B_11], [R_12, G_12, B_12] ],
[ [R_21, G_21, B_21], [R_22, G_22, B_22] ]]
should become
[[ [R_11, 0, 0], [0, G_12, 0] ],
[ [0, G_21, 0], [0, 0, B_22] ]]
via the elementwise mask
[[ [1, 0, 0], [0, 1, 0] ],
[ [0, 1, 0], [0, 0, 1] ]].
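Of course, I want to process not only 2x2 images but larger ones as well. The mask I want to apply is simply the 2x2 mask tiled to cover the image dimensions. This is not hard to do in numpy, but when I try to operate on a tensor inside the tf.data pipeline, I get the error: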
NotImplementedError: Cannot convert a symbolic Tensor (args_0:0) to a numpy array.
from the following code:
def bayerize_3d(t_img):
    """
    Inputs:
        img: tensor of dimension [H, W, 3], where H and W are divisible by 2, color channels are RGB
    Outputs:
        bayer: bayerized version of img, a tensor of dimension [H, W, 3] (1/3 of the former information)
    """
    np_mask = np.zeros_like(t_img)
    np_mask[0::2, 0::2, 0] = 1
    np_mask[0::2, 1::2, 1] = 1
    np_mask[1::2, 0::2, 1] = 1
    np_mask[1::2, 1::2, 2] = 1
    t_mask = tf.convert_to_tensor(np_mask)
    t_bayer = tf.math.multiply(t_img, t_mask)
    return t_bayer
For context, I am using TensorFlow 2, and I call bayerize_3d from the following:
def decode_img(img):
    """Taken from https://www.tensorflow.org/tutorials/load_data/images"""
    # convert the compressed string to a 3D uint8 tensor
    img = tf.image.decode_png(img, channels=3)
    # Use `convert_image_dtype` to convert to floats in the [0,1] range.
    img = tf.image.convert_image_dtype(img, tf.float32)
    return img

def load_vimeo_data(data_dir, dir_list, bayer_option='3d'):
    """Loads Vimeo-90K training / test data as tensorflow dataset, returns tuple of x and y datasets
    data_dir - path to vimeo90k training / test dataset directory
    dir_list - path to file containing relative directory paths to use for this dataset
    bayerize - either '3d' or '4d' - should """
    # ugly python 3.5 workaround
    target_dir = str( pathlib.Path(data_dir) / 'sequences' )
    #directories = list(map(lambda x: str(x), target_dir.glob('*/*')))
    directories = []
    with open(dir_list, 'rt') as f:
        for line in f:
            directories.append(str(pathlib.Path(target_dir) / line.strip()))
    im1str = tf.constant('/im1.png')
    im2str = tf.constant('/im2.png')
    im3str = tf.constant('/im3.png')
    bayerize = None
    if bayer_option == '3d':
        bayerize = bayerize_3d
    elif bayer_option == '4d':
        bayerize = bayerize_4d

    def process_dir(dir_path):
        im1 = decode_img(tf.io.read_file(tf.strings.join([dir_path, im1str])))
        im2 = decode_img(tf.io.read_file(tf.strings.join([dir_path, im2str])))
        im3 = decode_img(tf.io.read_file(tf.strings.join([dir_path, im3str])))
        return im1, im2, im3

    def bayerize_stack(im1, im2, im3):
        b_im1 = bayerize(im1)
        b_im2 = bayerize(im2)
        b_im3 = bayerize(im3)
        return tf.stack([b_im1, b_im2, b_im3], axis=0)

    def extract_middle(im1, im2, im3):
        im2 = tf.stack([im2], axis=0)
        return im2

    def process_stack(im1, im2, im3):
        return bayerize_stack(im1, im2, im3), extract_middle(im1, im2, im3)

    dirs = tf.data.Dataset.from_tensor_slices(directories)
    dirs_dataset = dirs.map(process_dir)
    # dataset is bayerized stack "labeled" with true middle frame
    dataset = dirs_dataset.map(process_stack)
    return dataset
Surely there is a proper way to do this?
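The problem seems to come from trying to create a numpy array from a tensor: inside Dataset.map the image is a symbolic tensor, so np.zeros_like(t_img) cannot read its values or shape.
My suggestion is to pass the image shape explicitly into your dataset pipeline and build the numpy array with those known dimensions.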
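As a rough illustration of that suggestion, the mask can be built once with NumPy from fixed dimensions, outside any mapped function. This is only a sketch: the helper name make_bayer_mask and the 4x6 size are made up for the example, and it assumes every image in the dataset has the same, even height and width.

```python
import numpy as np

def make_bayer_mask(h, w, dtype=np.float32):
    """Build an [h, w, 3] mask tiling the 2x2 pattern from the question.

    Hypothetical helper: h and w are assumed to be even and known in advance.
    """
    mask = np.zeros((h, w, 3), dtype=dtype)
    mask[0::2, 0::2, 0] = 1  # red at even rows, even columns
    mask[0::2, 1::2, 1] = 1  # green at even rows, odd columns
    mask[1::2, 0::2, 1] = 1  # green at odd rows, even columns
    mask[1::2, 1::2, 2] = 1  # blue at odd rows, odd columns
    return mask

# Example size only; substitute the real image dimensions.
np_mask = make_bayer_mask(4, 6)
```

The resulting array could then be converted once with tf.convert_to_tensor and multiplied with t_img inside the mapped function, since no NumPy call touches the symbolic tensor anymore.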
Alternatively, you can set the shape of the input tensor, e.g.
h = 500
w = 500
# set_shape takes a single shape argument (a list or tuple)
t_img.set_shape([h, w, 3])
np_mask = np.zeros(t_img.shape)
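Another option, not taken from the answer above, is to avoid NumPy entirely and tile the 2x2 pattern with TensorFlow ops, so the mask is built from the dynamic shape of each image. The following is a minimal sketch under that assumption; the names BAYER_PATTERN and bayerize_3d_tf are invented for the example, and the input is assumed to be a float32 tensor with even H and W.

```python
import tensorflow as tf

# The 2x2 mask from the question as a [2, 2, 3] constant.
BAYER_PATTERN = tf.constant(
    [[[1, 0, 0], [0, 1, 0]],
     [[0, 1, 0], [0, 0, 1]]],
    dtype=tf.float32,
)

def bayerize_3d_tf(t_img):
    """Zero out 2/3 of an [H, W, 3] float32 image (H and W assumed even)."""
    # tf.shape gives the runtime shape, which works even when the static
    # height and width are unknown inside Dataset.map.
    shape = tf.shape(t_img)
    reps = tf.stack([shape[0] // 2, shape[1] // 2, 1])
    t_mask = tf.tile(BAYER_PATTERN, reps)  # shape [H, W, 3]
    return t_img * t_mask

# Usage on a dummy image, both eagerly and inside a tf.data pipeline.
dummy = tf.ones([4, 6, 3], dtype=tf.float32)
eager_out = bayerize_3d_tf(dummy)
ds = tf.data.Dataset.from_tensors(dummy).map(bayerize_3d_tf)
mapped_out = next(iter(ds))
```

Because only TensorFlow ops are involved, the function can be traced by Dataset.map without any conversion to a NumPy array, and it does not require all images to share one size.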