[英]No gradients provided in tensorflow (mean_squared_error)
我正在嘗試建立一個簡單的2個輸入神經元(+1個偏差)的網絡,將其轉換為1個輸出神經元來教它“和”功能。 它基於mnist-clissification的示例,因此對於該任務而言可能過於復雜,但是對我而言,這是關於此類蚊帳的一般結構,因此請不要說“您可以用numpy做到”。關於Tensorflow神經網絡 所以這是代碼:
import tensorflow as tf
import numpy as np
tf.logging.set_verbosity(tf.logging.INFO)
def model_fn(features, labels, mode):
input_layer = tf.reshape(features["x"], [-1, 2])
output_layer = tf.layers.dense(inputs=input_layer, units=1, activation=tf.nn.relu, name="output_layer")
if mode == tf.estimator.ModeKeys.PREDICT:
return tf.estimator.EstimatorSpec(mode=mode, predictions=output_layer)
loss = tf.losses.mean_squared_error(labels=labels, predictions=output_layer)
if mode == tf.estimator.ModeKeys.TRAIN:
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
train_op = optimizer.minimize(loss=loss, global_step=tf.train.get_global_step())
return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)
eval_metrics_ops = {"accuracy": tf.metrics.accuracy(labels=labels, predictions=output_layer)}
return tf.estimator.EstimatorSpec(mode=mode, predictions=output_layer, loss=loss)
def main(unused_arg):
train_data = np.asarray(np.reshape([[0,0],[0,1],[1,0],[1,1]],[4,2]))
train_labels = np.asarray(np.reshape([0,0,0,1],[4,1]))
eval_data = train_data
eval_labels = train_labels
classifier = tf.estimator.Estimator(model_fn=model_fn, model_dir="/tmp/NN_AND")
tensors_to_log = {"The output:": "output_layer"}
logging_hook = tf.train.LoggingTensorHook(tensors=tensors_to_log,every_n_iter=10)
train_input_fn = tf.estimator.inputs.numpy_input_fn(x={"x":train_data}, y=train_labels, batch_size=10, num_epochs=None, shuffle=True)
classifier.train(input_fn=train_input_fn, steps=2000, hooks=[logging_hook])
eval_input_fn = tf.estimator.inputs.numpy_input_fn(x={"x":eval_data}, y=eval_labels, batch_size=1, shuffle=False)
eval_results = classifier.evaluate(input_fn=eval_input_fn)
print(eval_results)
if __name__ == "__main__":
tf.app.run()
我對您的代碼做了一些小的修改,使他們能夠學習and
功能:
1)將您的train_data
更改為float32表示形式。
train_data = np.asarray(np.reshape([[0,0],[0,1],[1,0],[1,1]],[4,2]), dtype=np.float32)`
2)從輸出層刪除relu激活-一般來說,不建議在輸出層中使用relus。 這可能導致死角,並且所有梯度將等於零,這反過來將使任何學習成為可能。
output_layer = tf.layers.dense(inputs=input_layer, units=1, activation=None, name="output_layer")
3)確保在eval_metrics_ops
中將結果取整,以便可以實際測量准確性:
eval_metrics_ops = {"accuracy": tf.metrics.accuracy(labels=labels, predictions=tf.round(output_layer))}
4)不要忘記將您定義的eval_metrics_ops
參數添加到估算器中:
return tf.estimator.EstimatorSpec(mode=mode, predictions=output_layer, loss=loss, eval_metric_ops=eval_metrics_ops)
另外,要記錄最后一層的輸出,您應該使用:
tensors_to_log = {"The output:": "output_layer/BiasAdd:0"}
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.