How to flatten data of arbitrary input shape?

Question

I'm building a CNN with Keras that predicts the coordinates of 13 keypoints in every image. The images vary in size, so my input layer shape is (None, None, 3). I am using Inception modules, so I am using the Functional API. While coding the last layers of my model, I ran into a problem. As far as I know, my output layer will be a Dense(26) layer, since I will encode the x and y coordinates as a vector. I have trouble connecting the output layer to the preceding convolutional layers because of the tensor dimensions:

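from keras import backend as K
from keras.layers import Input, Conv2D, Dropout, Activation, Lambda, Dense, concatenate

x = Input(shape=(None, None, 3))  # height and width are left unspecified
stage_1 = Conv2D(26, (1, 1))(x)
stage_1 = Dropout(0.3)(stage_1)
stage_2 = Conv2D(512, (1, 1))(x)
stage_2 = Dropout(0.3)(stage_2)
stage_2 = Activation('relu')(stage_2)
x = concatenate([stage_1, stage_2])
x = Lambda(lambda i: K.batch_flatten(i))(x)  # flattened size is unknown when the model is built
outputs = Dense(26)(x)  # this is where it breaks: Dense needs a known input size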

I tried including a Flatten layer (but it is not compatible with arbitrary input shapes), and I tried using K.batch_flatten() in a Lambda layer (which also did not work). My question is: is there a different way to get an output layer of a similar shape? (13, 2) would also be fine; I have only found models online where the output layer is a Dense layer. I also tried GlobalAveragePooling2D(), but this greatly decreased the accuracy of the model. Using a function to compute the output shape did not work either, see below:

stage_1 = Conv2D(26, (1, 1))(x)
stage_1 = Dropout(0.3)(stage_1)
stage_2 = Conv2D(512, (1, 1))(x)
stage_2 = Dropout(0.3)(stage_2)
stage_2 = Activation('relu')(stage_2)
x = concatenate([stage_1, stage_2])

def output_shape_batch(tensor_shape):
    print(tensor_shape)
    return (batch_size, tensor_shape[1] * tensor_shape[2] * tensor_shape[3])

x = Lambda(lambda i: K.batch_flatten(i), output_shape=output_shape_batch)(x)
outputs = Dense(26)(x)

I expected the model to compile, but I get a TypeError instead. The error is:

TypeError: unsupported operand type(s) for *: 'NoneType' and 'NoneType'

Answer 1

To the best of my knowledge, what you ask for is sadly not possible. I'll first try to explain why and then give you some options for what you could do instead.

A neural network usually expects a fixed-size input. Since every value of that input is connected to a weight, the size of the input is needed to calculate the number of weights when the model is initialized. Inputs of varying size are generally not possible, because they would change the number of weights, and it is unclear which weights to choose or how to train them in that case.
Convolutional layers are an exception to this. They use a fixed-size kernel, so the number of weights does not depend on the input size, which is why Keras supports these 'variable size' inputs. However, the input size of a convolutional layer changes its output size. This is not a problem if the next layer is also a convolutional layer, but when a dense layer is added, its input size has to be fixed. Usually a global pooling layer is used to reduce a variable-sized output to a fixed size. Then the dense layer can be added without a problem.
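To make the weight-count argument concrete, here is a small sketch in plain Python. The helper names (conv2d_param_count, dense_param_count, flattened_size) are made up for illustration and are not part of Keras; the formulas are the usual ones for a convolution with bias and a fully connected layer with bias. The 538 channels correspond to the 26 + 512 feature maps that the model in the question concatenates, and the example image sizes are arbitrary.

```python
def conv2d_param_count(kernel_h, kernel_w, in_channels, filters):
    """Weights plus biases of a 2D convolution; independent of image size."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_param_count(in_features, units):
    """Weights plus biases of a fully connected layer."""
    return in_features * units + units


def flattened_size(height, width, channels):
    """Number of values a Flatten layer would produce for one feature map."""
    return height * width * channels


channels = 26 + 512  # concatenated feature maps from the question's model

# The 1x1 convolution needs the same number of weights for any image size.
print(conv2d_param_count(1, 1, 3, 26))

# A Dense(26) after flattening would need a different weight matrix
# for every image size, which is why the model cannot be built.
for height, width in [(4, 4), (8, 8)]:
    in_features = flattened_size(height, width, channels)
    print((height, width), dense_param_count(in_features, 26))

# After global pooling only the channel axis remains, so the size is fixed.
print(dense_param_count(channels, 26))
```

The dense layer's weight count changes with the image size, while the convolution's and the pooled variant's do not.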
Since you want to predict coordinates in the image, global averaging will not be a good choice for you, because it destroys all the positional information. So here are two alternatives that you can consider:

You could rescale all your images to the same size during preprocessing.
You could choose a maximum size for your input images and then add (zero) padding to your images to make them all the same size.
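Either option also has to stay consistent with the keypoint labels. The following is a minimal NumPy sketch, assuming images are arrays of shape (height, width, channels) and keypoints are (x, y) pixel pairs. pad_to_size and rescale_keypoints are hypothetical helper names, and the actual image resizing is left to whatever image library you already use.

```python
import numpy as np


def pad_to_size(image, target_height, target_width):
    """Zero-pad an (H, W, C) image at the bottom and right edges.

    Padding only on these two sides keeps the origin in place, so the
    (x, y) keypoint coordinates stay valid without any adjustment.
    """
    height, width = image.shape[:2]
    if height > target_height or width > target_width:
        raise ValueError("image is larger than the target size")
    pad_spec = ((0, target_height - height), (0, target_width - width), (0, 0))
    return np.pad(image, pad_spec, mode="constant", constant_values=0)


def rescale_keypoints(keypoints, orig_size, new_size):
    """Scale (x, y) keypoints to match an image resized from orig_size
    to new_size. Both sizes are given as (height, width)."""
    keypoints = np.asarray(keypoints, dtype=np.float32)
    scale_x = new_size[1] / orig_size[1]
    scale_y = new_size[0] / orig_size[0]
    return keypoints * np.array([scale_x, scale_y], dtype=np.float32)


# Usage with made-up sizes: pad a small image up to a common maximum size.
image = np.ones((2, 3, 3), dtype=np.float32)
padded = pad_to_size(image, 4, 5)
print(padded.shape)

# Usage with made-up labels: adjust keypoints after halving the image size.
print(rescale_keypoints([(10, 20)], orig_size=(100, 200), new_size=(50, 100)))
```

With a fixed input size, a regular Flatten layer followed by Dense(26) becomes possible again, since the flattened size is known when the model is built.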

Answer 1: accepted, 2019-03-25 11:33:38
