
Variable Input Shape for Keras Sequential Model

I have a Sequential model defined as the following:

model = Sequential([
    BatchNormalization(axis=1,input_shape=(2,4)),
    Flatten(),
    Dense(256, activation='relu'),       
    BatchNormalization(),
    Dropout(0.1),
    Dense(2, activation='softmax')
])

I'd like to change this model to take inputs of variable shapes. Specifically, the first dimension needs to be variable. Reading the Keras docs on specifying the input shape, I see that you can use None entries in the input_shape tuple, where None indicates that any positive integer may be expected.

With my existing model, if I change the input_shape from (2,4) to (None,4), I receive the error below:

---> Dense(2, activation='softmax')
TypeError: an integer is required

I'm not positive, but I don't believe one can specify variable input shapes when the model contains a Flatten() layer. I've read that Flatten() needs to know the input shape, and so variable input shapes are not compatible with Flatten(). If I remove the Flatten() layer, I receive the same error as above. I wouldn't expect this model to work without the Flatten() layer, since I believe it is a requirement that the input is flattened before being passed to a Dense layer.

Given this, can anyone explain how I may be able to utilize variable input shapes? If the problem here is the Flatten() layer, what would be some ways to work around that, given that inputs should be flattened before being passed to Dense layers?

Thanks in advance for any advice.

Edit: To give an example of a potential training set -- for the model shown above with input_shape=(2,4), the training set may look like the following, where each 2-d array in the set has shape (2,4):

x_train = np.array([
         [[1, 1.02, 1.3, 0.9], [1.1, 1.2, 0.91, 0.99]], 
         [[1, 1.02, 1.3, 0.9], [1.1, 1.2, 0.91, 0.99]],
         [[1.01 ,1, 1.2, 1.2], [1.3, 1.2, 0.89, 0.98]]
        ])

For the data with input_shape=(None,4), where the shape of the first dimension of each data point can vary and the second is fixed at 4, the training set may look like:

x_train = np.array([
         [[1, 1.02, 1.3, 0.9], [1.1, 1.2, 0.91, 0.99], [1.1, 1.2, 0.91, 0.99]],
         [[1, 1.02, 1.3, 0.9], [1.1, 1.2, 0.91, 0.99]],
         [[1, 1, 1, 1], [1.3, 1.2, 0.89, 0.98], [1, 1, 1, 1], [1, 1, 1, 1]]
        ], dtype=object)  # dtype=object because the rows are ragged

x_train has a varying dimension, which will cause trouble at the training stage. Does it make a big deal to your data if we pad extra zeros? If not, find the maximum of the varying dimension and build your new array accordingly; see the sketch below. (The original answer illustrated this with Jupyter notebook screenshots captioned "dimensions of x_train and x_train2" and "how the zeros are padded"; the images are not reproduced here.)
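
A minimal sketch of that zero-padding step, assuming the ragged x_train from the question (max_len and x_train2 are illustrative names, not from the original notebook):

import numpy as np

x_train = [
    [[1, 1.02, 1.3, 0.9], [1.1, 1.2, 0.91, 0.99], [1.1, 1.2, 0.91, 0.99]],
    [[1, 1.02, 1.3, 0.9], [1.1, 1.2, 0.91, 0.99]],
    [[1, 1, 1, 1], [1.3, 1.2, 0.89, 0.98], [1, 1, 1, 1], [1, 1, 1, 1]],
]

# Maximum of the varying first dimension across all samples
max_len = max(len(sample) for sample in x_train)  # 4 for this data

# Copy each sample into a zero-filled array of fixed shape (max_len, 4)
x_train2 = np.zeros((len(x_train), max_len, 4), dtype=np.float32)
for i, sample in enumerate(x_train):
    x_train2[i, :len(sample)] = sample

print(x_train2.shape)  # (3, 4, 4) -- a regular array suitable for training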

The input shape in Keras must be fixed a priori; maybe you should use PyTorch to solve this problem (dynamic input).

To solve it in Keras, just find the max length of your first dimension and then use padding (zero values) to complete the other examples until they reach the max length.
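
A short sketch of that approach, assuming the ragged x_train from the question and using Keras' pad_sequences utility, which pads each sample along its first axis:

from tensorflow.keras.preprocessing.sequence import pad_sequences

# Pad each sample with zero rows at the end until all reach the max length
x_padded = pad_sequences(x_train, padding='post', dtype='float32')
print(x_padded.shape)  # (3, 4, 4) for the example data in the question

Here padding='post' appends the zero rows after the real data; the default padding='pre' would prepend them instead.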

If your expected output has a varying first dimension corresponding to the input, then the first dimension is the number of samples. In this case you may just omit the input_shape parameter from BatchNormalization and add an input layer with the number of features:

from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import BatchNormalization, Dense, Dropout, Flatten

model = Sequential([
  Input(shape=(4,)),
  BatchNormalization(axis=1),
  Flatten(),
  Dense(256, activation='relu'),
  BatchNormalization(),
  Dropout(0.1),
  Dense(2, activation='softmax')
])

Since your BatchNormalization is defined on axis=1, that is, on the feature axis, you don't need to define the first dimension, which is the batch size.

Model Summary

model.summary()
>>>
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
batch_normalization (BatchNo (None, 4)                 16        
_________________________________________________________________
flatten (Flatten)            (None, 4)                 0         
_________________________________________________________________
dense (Dense)                (None, 256)               1280      
_________________________________________________________________
batch_normalization_1 (Batch (None, 256)               1024      
_________________________________________________________________
dropout (Dropout)            (None, 256)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 2)                 514       
=================================================================
Total params: 2,834
Trainable params: 2,314
Non-trainable params: 520

Then you can run it on your inputs:

model.predict(x_train[0])
>>> array([[0.36491784, 0.63508224],
   [0.3834786 , 0.61652136],
   [0.3834786 , 0.61652136]], dtype=float32)

model.predict(x_train[1])
>>> array([[0.36491784, 0.63508224],
   [0.38347858, 0.61652136]], dtype=float32)

HOWEVER, if you want to generate outputs of shape (1,2) for each sample in your x_train, then each line in x_train is a single sample. In that case your Dense layer would need a variable number of parameters, which doesn't make sense for gradient descent.

In that case you may be looking for a Recurrent Neural Network, which is a different beast; an example could be something like this:

import tensorflow as tf
from tensorflow.keras.layers import Dense, Input, LSTM

model = tf.keras.Sequential()
model.add(Input(shape=(None, 4)))  # first dimension varies per sample
model.add(LSTM(128))
model.add(Dense(2))

Model summary

model.summary()
>>>
Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_3 (LSTM)                (None, 128)               68096     
_________________________________________________________________
dense_4 (Dense)              (None, 2)                 258       
=================================================================
Total params: 68,354
Trainable params: 68,354
Non-trainable params: 0

To run it on your dataset, just expand the first dimension of each sample to say it's a batch of size 1, that is, a single sample:

model.predict(np.expand_dims(x_train[0],0))
>>>
array([[0.19657324, 0.09764521]], dtype=float32)

model.predict(np.expand_dims(x_train[1],0))
>>>
array([[0.15233153, 0.08189206]], dtype=float32)
