
tf.nn.conv2d_transpose(): double width and height for dynamic shapes of the input tensor

When I tried to use tf.nn.conv2d_transpose() to get a layer result with doubled width and height and halved depth, it worked as long as I used a fully specified shape [batch, width, height, channel] for both input and output.

By setting the batch size to None, training works well with a specified batch size, and validation works well for a batch of images (or a single image).

Now, I'm trying to build an encoder-decoder network structure using [128 x 128 x 3] training images. (Those training images are patches cropped from [w x h x 3] original images.)

The input shape is [128 x 128 x 3] and the output shape is [128 x 128 x 3]. The first layer of the encoder-decoder structure is a convolution layer with k = 3x3, strides = 1, padding = 1.
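
For reference, a minimal sketch of what such a first convolution layer looks like, assuming TensorFlow 1.x (the function and variable names below are illustrative, not taken from my actual code):

def Conv2d(input, inC, outC):
    # 3x3 kernel, stride 1; "SAME" padding is equivalent to padding = 1 for a 3x3 kernel
    kernel = tf.Variable(tf.random_normal([3, 3, inC, outC], stddev=0.01))
    return tf.nn.conv2d(input, kernel, strides=[1, 1, 1, 1], padding="SAME")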

All of the above works well for the specified width and height (128 x 128).

However, after training finishes on the [128 x 128 x 3] training patches, I'd like to run inference on a [w x h x 3] image with the trained network.

I guess all of the operations (convolution, max pooling) give correct results, except the transpose convolution.
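
For context, the convolution and pooling layers never need the static spatial shape, so unknown width and height are not a problem for them; a minimal sketch of such a pooling layer, assuming TF 1.x:

def MaxPool2d(input):
    # 2x2 max pooling with stride 2; works even when width/height are None in the static shape
    return tf.nn.max_pool(input, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")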

When I infer a fixed image shape of [128 x 128 x 3]:

import numpy as np
import tensorflow as tf

InputImagesTensor = tf.placeholder(tf.float32, [None, 128, 128, 3], name='InputImages')
ResultImages = libs.Network(InputImagesTensor)
saver = tf.train.Saver()
w = 128
h = 128

sess = tf.Session()
sess.run(tf.global_variables_initializer())
saver.restore(sess, 'output.ckpt')
for i in range(0, len(Datas.InputImageNameList)):
    temp = np.resize(getResizedImage(Datas.InputImageList[i]), (1, 128, 128, 3))
    resultimg = sess.run(ResultImages, feed_dict={InputImagesTensor: temp})

with this transpose convolution layer inside the network:

def Transpose2d(input, inC, outC):
    b, w, h, c = input.shape                 # static shape; w and h are plain ints for a fixed-size placeholder

    batch_size = tf.shape(input)[0]          # batch dimension taken from the runtime shape
    deconv_shape = tf.stack([batch_size, int(w * 2), int(h * 2), outC])
    kernel = tf.Variable(tf.random_normal([2, 2, outC, inC], stddev=0.01))
    output_shape = [None, int(w * 2), int(h * 2), outC]   # unused
    transConv = tf.nn.conv2d_transpose(input, kernel, output_shape=deconv_shape, strides=[1, 2, 2, 1], padding="SAME")

    return transConv

Now, I tried to convert it from fixed width and height to dynamic width and height. In my opinion, this should work (however, it failed).

Change

InputImagesTensor = tf.placeholder(tf.float32, [None, 128, 128, 3], name='InputImages')
temp = np.resize(getResizedImage(Datas.InputImageList[i]), (1, 128, 128, 3))

to

InputImagesTensor = tf.placeholder(tf.float32, [None, None, None, 3], name='InputImages')
temp = np.resize(getResizedImage(Datas.InputImageList[i]), (1, w, h, 3))

However, this line raises an error:

deconv_shape = tf.stack([batch_size, int(w*2), int(h*2), outC])

TypeError: __int__ returned non-int (type NoneType)

I guess it is because the static width and height are None, and a None value cannot be doubled into 2*None.
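
For comparison, here is a minimal sketch of the same layer with every output dimension taken from the runtime tf.shape(input) instead of the static shape (the same mechanism already used for the batch dimension); this is only an assumption on my part and I have not verified it in my full setup:

def Transpose2dDynamic(input, inC, outC):
    in_shape = tf.shape(input)               # runtime shape, resolved when the graph is executed
    deconv_shape = tf.stack([in_shape[0], in_shape[1] * 2, in_shape[2] * 2, outC])
    kernel = tf.Variable(tf.random_normal([2, 2, outC, inC], stddev=0.01))
    return tf.nn.conv2d_transpose(input, kernel, output_shape=deconv_shape, strides=[1, 2, 2, 1], padding="SAME")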

How can I do this? Is it even possible?

Self answer:

I could not come up with a suitable "standard" solution that keeps w and h as None for the transpose convolution.

However, I solved the problem by setting the transpose convolution's output shape to the maximum shape of my training/validation images. For example, if the maximum width and height of my images is [656 x 656] and a test image is [450 x 656], I create a zero-filled np.ndarray of [656 x 656] and fill the [450 x 656] region with the test image's RGB values (i.e. I apply the concept of zero-padding).
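
A minimal sketch of that zero-padding step with NumPy (the 656 x 656 maximum size comes from my own data, and the helper name is illustrative):

MAX_H, MAX_W = 656, 656                      # maximum height/width over my training/validation images

def padToMax(image):
    # image: [h, w, 3] array with h <= MAX_H and w <= MAX_W
    h, w, _ = image.shape
    padded = np.zeros((MAX_H, MAX_W, 3), dtype=image.dtype)
    padded[:h, :w, :] = image                # fill the top-left region with the test image RGB
    return padded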
