
TensorFlow's loss function returns NaN after changing RNN to LSTM cell

I am training a model to predict a time series using an RNN. The model trains without any issue. Here's the original code:

tf.reset_default_graph()

num_inputs = 1
num_neurons = 100
num_outputs = 1
learning_rate = 0.0001
num_train_iterations = 2000
batch_size = 1

X = tf.placeholder(tf.float32, [None, time_steps-1, num_inputs])
y = tf.placeholder(tf.float32, [None, time_steps-1, num_outputs])
cell = tf.contrib.rnn.OutputProjectionWrapper(
    tf.contrib.rnn.BasicRNNCell(num_units=num_neurons, activation=tf.nn.relu),
    output_size=num_outputs)
outputs, states = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)
loss = tf.reduce_mean(tf.square(outputs - y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
train = optimizer.minimize(loss)
init = tf.global_variables_initializer()
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.75)

with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)) as sess:
    sess.run(init)
    for iteration in range(num_train_iterations):

        elx, ely = next_batch(training_data, time_steps)
        sess.run(train, feed_dict={X: elx, y: ely})

        if iteration % 100 == 0:

            mse = loss.eval(feed_dict={X: elx, y: ely})
            print(iteration, "\tMSE:", mse)

The problem comes when I change tf.contrib.rnn.BasicRNNCell to tf.contrib.rnn.BasicLSTMCell: training slows down dramatically and the loss (the mse variable) becomes NaN. My best guess is that MSE is the wrong loss function and that I should try cross entropy instead. I searched for similar code and found that tf.nn.softmax_cross_entropy_with_logits() could be the solution, but I still don't understand how to apply it to my problem.
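That is, the only change to the code above is the cell definition:

# The only change: BasicRNNCell -> BasicLSTMCell (as described above)
cell = tf.contrib.rnn.OutputProjectionWrapper(
    tf.contrib.rnn.BasicLSTMCell(num_units=num_neurons, activation=tf.nn.relu),
    output_size=num_outputs)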

Usually NaN occurs when your gradients blow up. Here is some example code using tf.nn.softmax_cross_entropy_with_logits; give it a try.

# Output layer
logit = tf.add(tf.matmul(H1, w2), b2)
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logit, labels=Y)

# Cost
cost = tf.reduce_mean(cross_entropy)

# Optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Prediction
y_pred = tf.nn.softmax(logit)
pred = tf.argmax(y_pred, axis=1)
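If you want to keep the MSE loss, another option against exploding gradients is to clip the gradients before they are applied. Here is a rough sketch (using the variable names from the question, with an arbitrarily chosen clipping range):

# Sketch: clip each gradient to a fixed range before Adam applies it,
# a common way to stop an exploding-gradient LSTM from producing NaN losses.
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
grads_and_vars = optimizer.compute_gradients(loss)
clipped = [(tf.clip_by_value(g, -5.0, 5.0), v)
           for g, v in grads_and_vars if g is not None]
train = optimizer.apply_gradients(clipped)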
