
Keras model loss function returning nan

I am training an autoencoder in keras which is defined like this:

from keras.models import Sequential
from keras.layers import LSTM, RepeatVector, TimeDistributed, Dense
from sklearn.model_selection import train_test_split

model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(430, 3)))
model.add(RepeatVector(430))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(3)))
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])
print(model.summary())

context_paths = loadFile()
X_train, X_test = train_test_split(context_paths, test_size=0.20)

print('Fitting model.')
history = model.fit(X_train, X_train, epochs=1, batch_size=8, verbose=1, shuffle=True, validation_data=(X_test, X_test))

predict_sample = X_train[0].reshape((1, 430, 3))

predict_output = model.predict(predict_sample, verbose=0)
print(predict_output[0, :, 0])

This code doesn't give any errors, but when I run it, the loss is nan. I have checked some questions on SO and found that this problem occurs when:

  • nan or infinite values are present --> I checked my input data with numpy.isnan(myarray).any(), which returned False, and I also did numpy.isfinite(myarray).any(), which returned True, so I assumed my data is all right
  • batch size is too big --> I reduced from 32 to 8, didn't help much
  • layer size is too big --> I reduced from 100 to 24, didn't help much

Here is a picture of the first few batches: [screenshot of the training output showing the loss values]

Here the loss is gigantic, but I am not sure what is causing it. The range of numbers in my dataset reaches the limits of int32. Also, my data is padded with 0's.
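For reference, the checks above amount to something like this minimal sketch (X_train being the array produced by loadFile() above):

import numpy as np

print(X_train.dtype)                  # int32 in my case
print(X_train.min(), X_train.max())   # values reach the limits of int32
print(np.isnan(X_train).any())        # returned False
print(np.isfinite(X_train).any())     # returned True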

You clearly have data with a huge range. You're overflowing everything, as you yourself observed:

The range of numbers in my dataset reaches the limits of int32

Normalize your data before using it in a model.

The correct verification for infinite values should be:

numpy.isfinite(myarray).all()
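The difference matters: .any() only tells you that at least one element is finite, while .all() tells you that every element is finite. A minimal illustration (the array here is made up for demonstration):

import numpy as np

a = np.array([1.0, np.inf, 3.0])

print(np.isfinite(a).any())   # True  -- at least one element is finite
print(np.isfinite(a).all())   # False -- the inf makes the check fail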

You can try a transform to a 0-to-1 range (you need to convert the data to float first):

# convert from int32 to float before scaling
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')

xMax = x_train.max()
xMin = x_train.min()
xRange = xMax - xMin

# min-max scale to the 0-1 range, using statistics from the training set only
x_train = (x_train - xMin) / xRange
x_test = (x_test - xMin) / xRange

Do the same with y.
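Note that in this autoencoder the target y is the input itself (model.fit(X_train, X_train, ...)), so once X_train and X_test are normalized there is nothing extra to do for y. A minimal sketch of the rest of the pipeline, using the question's variable names and assuming the min-max scaling above has already been applied:

# X_train / X_test already converted to float and scaled to the 0-1 range
history = model.fit(X_train, X_train, epochs=1, batch_size=8, verbose=1,
                    shuffle=True, validation_data=(X_test, X_test))

# predictions come back in the normalized scale; invert the transform
# to get values in the original units
predict_output = model.predict(X_train[0].reshape((1, 430, 3)), verbose=0)
predict_output = predict_output * xRange + xMin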

You could try a Z-transform too:

xMean = x_train.mean()
xStd = x_train.std()

# standardize to zero mean and unit variance, again using training-set statistics
x_train = (x_train - xMean) / xStd
x_test = (x_test - xMean) / xStd
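One more thing worth guarding against in either transform: if xRange or xStd happens to be 0 (e.g. a constant feature), the division itself produces nan/inf and you are back to a nan loss. A minimal sketch of such a guard for the Z-transform (eps is just an illustrative name):

import numpy as np

eps = 1e-8  # tiny constant so a zero std cannot cause a division by zero

x_train = (x_train - xMean) / (xStd + eps)
x_test = (x_test - xMean) / (xStd + eps)

# sanity check before training: everything should be finite now
assert np.isfinite(x_train).all() and np.isfinite(x_test).all()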
