I'm trying to make a toy neural network that simply learns how to sort an array. It's just an experiment, but I can't get it to work when using LSTMs. I'm probably missing something about the input/output shape requirements.
Here's the code. I've included everything, starting from the data creation; you can skip that and go directly to the fit error at the end:
Data Creation:
import numpy as np
n=20
m=10
test_r=0.1
val_r=0.1
int(n*(val_r+test_r))
X=np.random.rand(n,m)
Y=np.sort(X, axis=1)
val_n=int(n*(val_r))
test_n=int(n*(test_r))
X_train=X[:n-val_n-test_n]
X_val=X[n-val_n-test_n:n- test_n]
X_test=X[n-test_n:]
y_train=Y[:n-val_n-test_n]
y_val=Y[n-val_n-test_n:n- test_n]
y_test=Y[n-test_n:]
Here I build the model:
from keras.models import Sequential,Model
from keras.layers import Dense,Input,LSTM, Flatten
from keras.optimizers import Adam
from keras.callbacks import ModelCheckpoint
def model_build():
    in_x = Input(shape=(m,1))
    x = LSTM(128,activation='relu')(in_x)
    x=Dense(m)(x)
    model = Model(inputs=in_x, output=x)
    return model
The model builds just fine:
model = model_build()
model.summary()
Model: "model_23"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_28 (InputLayer)        (None, 10, 1)             0
_________________________________________________________________
lstm_28 (LSTM)               (None, 128)               66560
_________________________________________________________________
dense_23 (Dense)             (None, 10)                1290
=================================================================
Total params: 67,850
Trainable params: 67,850
Non-trainable params: 0
_________________________________________________________________
/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:5: UserWarning: Update your `Model` call to the Keras 2 API: `Model(inputs=Tensor("in..., outputs=Tensor("de...)`
It also compiles without problems:
model= model_build()
opt = Adam(lr=0.0001, decay=1e-5)
chkpt = ModelCheckpoint(filepath='/content/drive/My Drive/CoLab/h5py/best_forecast2mse.h5',monitor='mean_squared_error', save_best_only=True, save_weights_only=True)
chkpt2 = ModelCheckpoint(filepath='/content/drive/My Drive/CoLab/h5py/best_forecast2mae.h5',monitor='mean_absolute_error', save_best_only=True, save_weights_only=True)
callbacks_list=[chkpt,chkpt2]
model.compile(loss='mse', metrics=['mse','mae'],optimizer=opt)
I reshape the data in the format suitable for LSTMs:
y_train=np.reshape(y_train,np.shape(y_train)+(1,) )
X_train=np.reshape(X_train,np.shape(X_train)+(1,) )
It's in the shape of (n_samples,n_steps,n_variables) so it should work:
np.shape(X_train), np.shape(y_train)
((16, 10, 1), (16, 10, 1))
I try to fit:
H=model.fit(X_train,y_train, validation_data=(X_val,y_val), callbacks=callbacks_list, epochs=100)
It gives me this error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-104-f68ddc068cd4> in <module>()
----> 1 H=model.fit(X_train,y_train, validation_data=(X_val,y_val), callbacks=callbacks_list, epochs=100)
2 frames
/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, max_queue_size, workers, use_multiprocessing, **kwargs)
1152 sample_weight=sample_weight,
1153 class_weight=class_weight,
-> 1154 batch_size=batch_size)
1155
1156 # Prepare validation data.
/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
619 feed_output_shapes,
620 check_batch_axis=False, # Don't enforce the batch size.
--> 621 exception_prefix='target')
622
623 # Generate sample-wise weight values given the `sample_weight` and
/usr/local/lib/python3.6/dist-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
133 ': expected ' + names[i] + ' to have ' +
134 str(len(shape)) + ' dimensions, but got array '
--> 135 'with shape ' + str(data_shape))
136 if not check_batch_axis:
137 data_shape = data_shape[1:]
ValueError: Error when checking target: expected dense_24 to have 2 dimensions, but got array with shape (16, 10, 1)
Your model produces an output that has a shape of (batch_size, 10). You can tell this from the last layer row of model.summary(). This output needs to have the same shape as the target so that the two can be compared in the loss function. Instead, you pass it a target that has a shape of (16, 10, 1).
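To illustrate why the shapes have to agree, here is a minimal NumPy sketch of an element-wise mean squared error. The `mse` helper is hypothetical and is not how Keras implements its loss; it only mimics the shape check that produces the error above.

```python
import numpy as np

def mse(y_true, y_pred):
    # Hypothetical helper: insist on identical shapes before comparing element-wise.
    if y_true.shape != y_pred.shape:
        raise ValueError(
            f"shape mismatch: target {y_true.shape} vs output {y_pred.shape}"
        )
    return float(np.mean((y_true - y_pred) ** 2))

y_pred = np.zeros((16, 10))     # what Dense(10) yields for 16 samples
y_good = np.ones((16, 10))      # target left as (n_samples, 10)
y_bad = np.ones((16, 10, 1))    # target after the extra reshape

print(mse(y_good, y_pred))      # 1.0

try:
    mse(y_bad, y_pred)
except ValueError as err:
    print(err)                  # reports the mismatching shapes
```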
In order for this to work, your target needs to have a shape of (16, 10). The shape that you are trying to produce, i.e. (n_samples, n_steps, n_variables), applies only to the input of this model, not to its output.
To make this work just remove this line from your code:
y_train = np.reshape(y_train, np.shape(y_train) + (1,))
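For reference, a minimal sketch of the data preparation with that change applied, assuming the same n, m and 10% split sizes as in the question. The helper name `add_feature_axis` is made up for illustration. It also reshapes X_val, on the assumption that the validation inputs have to match the model's (m, 1) input shape just like the training inputs.

```python
import numpy as np

n, m = 20, 10
val_n, test_n = 2, 2  # int(n * 0.1) each, as in the question

rng = np.random.RandomState(0)
X = rng.rand(n, m)
Y = np.sort(X, axis=1)

def add_feature_axis(a):
    """Turn (n_samples, n_steps) into (n_samples, n_steps, 1)."""
    return a[..., np.newaxis]

train_end = n - val_n - test_n

# Inputs get the extra axis expected by the LSTM ...
X_train = add_feature_axis(X[:train_end])
X_val = add_feature_axis(X[train_end:n - test_n])

# ... while targets stay 2-D to match the Dense(m) output.
y_train = Y[:train_end]
y_val = Y[train_end:n - test_n]

print(X_train.shape, y_train.shape)  # (16, 10, 1) (16, 10)
print(X_val.shape, y_val.shape)      # (2, 10, 1) (2, 10)
```

These arrays can then be passed to model.fit exactly as in the question.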