
Variable Input Shape for Keras Sequential Model

I have a Sequential model defined as the following:

model = Sequential([
    BatchNormalization(axis=1,input_shape=(2,4)),
    Flatten(),
    Dense(256, activation='relu'),       
    BatchNormalization(),
    Dropout(0.1),
    Dense(2, activation='softmax')
])

I'd like to change this model to take inputs of variable shapes. Specifically, the first dimension needs to be variable. Reading the Keras docs on specifying the input shape, I see that you can use None entries in the input_shape tuple, where None indicates that any positive integer may be expected.

With my existing model, if I change the input_shape from (2,4) to (None,4), I receive the error below:

---> Dense(2, activation='softmax')
TypeError: an integer is required

I'm not positive, but I don't believe one can specify variable input shapes when the model contains a Flatten() layer. I've read that Flatten() needs to know the input shape, and so variable input shapes are not compatible with Flatten(). If I remove the Flatten() layer, I receive the same error as above. I wouldn't expect this model to work without the Flatten() layer since I believe it is a requirement that the input is flattened before being passed to a Dense layer.

Given this, can anyone explain how I may be able to utilize variable input shapes? If the problem here is the Flatten() layer, what would be some ways to work around that given that inputs should be flattened before being passed to Dense layers?

Thanks in advance for any advice.

Edit: To give an example of a potential training set: for the model shown above with input_shape=(2,4), the training set may look like the following, where each 2-d array in the set has shape (2,4):

x_train = np.array([
         [[1, 1.02, 1.3, 0.9], [1.1, 1.2, 0.91, 0.99]], 
         [[1, 1.02, 1.3, 0.9], [1.1, 1.2, 0.91, 0.99]],
         [[1.01 ,1, 1.2, 1.2], [1.3, 1.2, 0.89, 0.98]]
        ])

For the data with input_shape = (None,4), where the shape of the first dimension of each data point can vary, and the second is fixed at 4, the training set may look like:

x_train = np.array([
         [[1, 1.02, 1.3, 0.9], [1.1, 1.2, 0.91, 0.99], [1.1, 1.2, 0.91, 0.99]], 
         [[1, 1.02, 1.3, 0.9], [1.1, 1.2, 0.91, 0.99]],
         [[1,1,1,1], [1.3, 1.2, 0.89, 0.98], [1,1,1,1], [1,1,1,1]]
        ])
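Note that a nested list whose rows have different lengths cannot be stored as a regular NumPy array; recent NumPy versions require `dtype=object` explicitly, and the result is a 1-d array of ragged samples rather than a single 3-d tensor. A minimal sketch of the second training set above:

```python
import numpy as np

# Ragged samples: the first dimension varies per sample, the second is fixed at 4.
x_train = np.array([
    [[1, 1.02, 1.3, 0.9], [1.1, 1.2, 0.91, 0.99], [1.1, 1.2, 0.91, 0.99]],
    [[1, 1.02, 1.3, 0.9], [1.1, 1.2, 0.91, 0.99]],
    [[1, 1, 1, 1], [1.3, 1.2, 0.89, 0.98], [1, 1, 1, 1], [1, 1, 1, 1]],
], dtype=object)

# Each element is a separate ragged sample, not a slice of one 3-d array.
print([np.asarray(s).shape for s in x_train])  # [(3, 4), (2, 4), (4, 4)]
```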

x_train has a varying first dimension, which will cause trouble at the training stage. Would padding with extra zeros do much harm to your data? If not, find the maximum of the varying dimension and build your new array accordingly, as illustrated in the Jupyter notebook screenshots (showing the dimensions of x_train and x_train2, and the zero-padding approach).
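The padding idea from the (missing) notebook screenshots can be sketched in plain NumPy; the names `samples`, `max_len`, and `x_train2` here are illustrative. Find the longest sample, then copy each sample into a zero-filled array of that length:

```python
import numpy as np

samples = [
    np.array([[1, 1.02, 1.3, 0.9], [1.1, 1.2, 0.91, 0.99], [1.1, 1.2, 0.91, 0.99]]),
    np.array([[1, 1.02, 1.3, 0.9], [1.1, 1.2, 0.91, 0.99]]),
    np.array([[1, 1, 1, 1], [1.3, 1.2, 0.89, 0.98], [1, 1, 1, 1], [1, 1, 1, 1]]),
]

max_len = max(s.shape[0] for s in samples)      # longest first dimension (4 here)
x_train2 = np.zeros((len(samples), max_len, 4)) # zero-filled target array
for i, s in enumerate(samples):
    x_train2[i, :s.shape[0], :] = s             # copy each sample; the rest stays 0

print(x_train2.shape)  # (3, 4, 4)
```

`tf.keras.preprocessing.sequence.pad_sequences` can do the same thing in one call if you prefer not to write the loop.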

The input shape in Keras must be fixed a priori; you might consider PyTorch for this problem, since it supports dynamic input shapes.

To solve it in Keras, find the maximum length of the first dimension and then zero-pad the other examples until they reach that maximum length.

If your expected output has a varying first dimension corresponding to the input, then the first dimension is the number of samples. In this case you can simply omit the input_shape parameter from BatchNormalization and add an Input layer with the number of features:

model = Sequential([
  Input(shape=(4,)),
  BatchNormalization(axis=1),
  Flatten(),
  Dense(256, activation='relu'),
  BatchNormalization(),
  Dropout(0.1),
  Dense(2, activation='softmax')
])

Since your BatchNormalization is defined with axis=1, that is, on the feature axis, you don't need to specify the first dimension, which is the batch size.

Model Summary

model.summary()
>>>
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
batch_normalization (BatchNo (None, 4)                 16        
_________________________________________________________________
flatten (Flatten)            (None, 4)                 0         
_________________________________________________________________
dense (Dense)                (None, 256)               1280      
_________________________________________________________________
batch_normalization_1 (Batch (None, 256)               1024      
_________________________________________________________________
dropout (Dropout)            (None, 256)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 2)                 514       
=================================================================
Total params: 2,834
Trainable params: 2,314
Non-trainable params: 520

Then you can run it on your inputs

model.predict(x_train[0])
>>> array([[0.36491784, 0.63508224],
   [0.3834786 , 0.61652136],
   [0.3834786 , 0.61652136]], dtype=float32)

model.predict(x_train[1])
>>> array([[0.36491784, 0.63508224],
   [0.38347858, 0.61652136]], dtype=float32)

HOWEVER, if you want to generate outputs of shape (1,2) for each sample in your x_train, then each line in x_train is a single sample. In that case your Dense layer would need a variable number of parameters, which doesn't make sense for gradient descent.

In that case you may be looking for a Recurrent Neural Network, which is a different beast; an example could be something like this:

model = tf.keras.Sequential()
model.add(Input((None, 4)))
model.add(LSTM(128))
model.add(Dense(2))

Model summary

model.summary()
>>>
Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_3 (LSTM)                (None, 128)               68096     
_________________________________________________________________
dense_4 (Dense)              (None, 2)                 258       
=================================================================
Total params: 68,354
Trainable params: 68,354
Non-trainable params: 0

To run it on your dataset, just expand the first dimension of each sample to say it's a batch of size 1, that is, a single sample.

model.predict(np.expand_dims(x_train[0],0))
>>>
array([[0.19657324, 0.09764521]], dtype=float32)

model.predict(np.expand_dims(x_train[1],0))
>>>
array([[0.15233153, 0.08189206]], dtype=float32)
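If you instead want to train the LSTM on padded batches (rather than predicting one sample at a time), a Masking layer lets the network skip the zero-padded timesteps. A minimal sketch, assuming random placeholder data and the zero-padding scheme described earlier:

```python
import numpy as np
import tensorflow as tf

# Placeholder ragged samples with 3, 2, and 4 timesteps of 4 features each.
samples = [np.random.rand(3, 4), np.random.rand(2, 4), np.random.rand(4, 4)]
max_len = max(s.shape[0] for s in samples)
padded = np.zeros((len(samples), max_len, 4), dtype="float32")
for i, s in enumerate(samples):
    padded[i, :s.shape[0]] = s  # trailing timesteps stay all-zero

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(None, 4)),   # variable number of timesteps
    tf.keras.layers.Masking(mask_value=0.0),  # ignore all-zero (padded) timesteps
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

print(model.predict(padded).shape)  # (3, 2): one output row per sample
```

One caveat: Masking treats any all-zero timestep as padding, so this only works if genuine data rows are never exactly zero.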
