
Keras nn loss is inf/nan

I am trying to develop a hello-world NN application in Keras (TensorFlow). I want to create a basic model that fits the equation y = 0.5 + 0.5x. I wrote this code:

import tensorflow as tf
import numpy as np
from tensorflow import keras
TRAINING_DATA_SIZE = 20
model = keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])]) 
model.compile(optimizer='sgd', loss='mean_squared_error')
xs = np.array(range(TRAINING_DATA_SIZE), dtype=float)
ys = np.array([(0.5 + 0.5 * i) for i in range(TRAINING_DATA_SIZE)], dtype=float)
model.fit(xs, ys, epochs=500)
print(model.predict([7.0]))

However, if TRAINING_DATA_SIZE > 10, the loss is inf after about 100 epochs and becomes nan after about 200 epochs.

What causes this? Why can't I train on a larger data set? Thanks.

With the default SGD learning rate, the optimizer overshoots: the MSE gradient scales with the square of the inputs, so once the xs reach double digits each update step overshoots the minimum and the weights diverge to inf, then nan.

Try:

model.compile(optimizer='adam', loss='mean_squared_error')

You have a single weight and a single bias. With a little tweaking of the learning rate, this model converges in a few iterations.

For instance:

TRAINING_DATA_SIZE = 200

opt = keras.optimizers.Adam(learning_rate=0.1)  # `lr` is a deprecated alias in newer Keras
model.compile(opt, loss='mean_squared_error')

model.fit(xs, ys, epochs=50, validation_split=0.2, verbose=False)

print('w, b:', model.layers[0].get_weights())
print(model.predict([7.0]))

shows:

w, b: [array([[0.5000057]], dtype=float32), array([0.49888334], dtype=float32)]

[[3.9989233]]

These are reasonable approximations of the target weight (0.5) and bias (0.5).
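Another common fix, beyond switching optimizers, is to rescale the inputs so plain SGD stays stable. The sketch below (plain numpy, same gradient-descent update as the model above; the scaling scheme is my illustration, not part of the original answer) divides the inputs by their maximum, trains in the scaled space, and then undoes the scaling on the learned weight:

```python
import numpy as np

TRAINING_DATA_SIZE = 200
xs = np.arange(TRAINING_DATA_SIZE, dtype=float)
ys = 0.5 + 0.5 * xs

# scale inputs to [0, 1]; gradients no longer blow up with the data range
scale = xs.max()
xn = xs / scale

w, b = 0.0, 0.0
lr = 0.5  # a large step is now stable because mean(xn**2) <= 1
for _ in range(2000):
    err = w * xn + b - ys
    w -= lr * 2.0 * np.mean(err * xn)
    b -= lr * 2.0 * np.mean(err)

# undo the scaling: y = (w / scale) * x + b
print(round(w / scale, 3), round(b, 3))  # approximately 0.5 and 0.5
```

In Keras the same effect can be had by normalizing xs before `fit` (or with a normalization layer) and rescaling predictions accordingly.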
