TensorFlow: No gradients provided for any variable

Question

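I'm experimenting with some code in Jupyter and keep getting stuck here. If I remove the line starting with "optimizer = ..." and all references to it, everything works fine. But as soon as I put that line back into the code, I get an error.

I'm not pasting all the other functions here, to keep the code at a readable size. I hope someone more experienced can see the problem right away.

Note that there are 5, 4, 3 and 2 units in the input layer, the 2 hidden layers and the output layer, respectively.

Code: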

tf.reset_default_graph()

num_units_in_layers = [5,4,3,2]

X = tf.placeholder(shape=[5, 3], dtype=tf.float32)
Y = tf.placeholder(shape=[2, 3], dtype=tf.float32)
parameters = initialize_layer_parameters(num_units_in_layers)
init = tf.global_variables_initializer() 

my_sess = tf.Session()
my_sess.run(init)
ZL = forward_propagation_with_relu(X, num_units_in_layers, parameters, my_sess)
#my_sess.run(parameters)  # Do I need to run this? Or is it obsolete?

cost = compute_cost(ZL, Y, my_sess, parameters, batch_size=3, lambd=0.05)
optimizer =  tf.train.AdamOptimizer(learning_rate = 0.001).minimize(cost)
_ , minibatch_cost = my_sess.run([optimizer, cost], 
                                 feed_dict={X: minibatch_X, 
                                            Y: minibatch_Y})

print(minibatch_cost)
my_sess.close()

Error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-321-135b9fc18268> in <module>()
     16 cost = compute_cost(ZL, Y, my_sess, parameters, 3, 0.05)
     17 
---> 18 optimizer =  tf.train.AdamOptimizer(learning_rate = 0.001).minimize(cost)
     19 _ , minibatch_cost = my_sess.run([optimizer, cost], 
     20                                  feed_dict={X: minibatch_X, 

~/.local/lib/python3.5/site-packages/tensorflow/python/training/optimizer.py in minimize(self, loss, global_step, var_list, gate_gradients, aggregation_method, colocate_gradients_with_ops, name, grad_loss)
    362           "No gradients provided for any variable, check your graph for ops"
    363           " that do not support gradients, between variables %s and loss %s." %
--> 364           ([str(v) for _, v in grads_and_vars], loss))
    365 
    366     return self.apply_gradients(grads_and_vars, global_step=global_step,

ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients, between variables ["<tf.Variable 'weights/W1:0' shape=(4, 5) dtype=float32_ref>", "<tf.Variable 'biases/b1:0' shape=(4, 1) dtype=float32_ref>", "<tf.Variable 'weights/W2:0' shape=(3, 4) dtype=float32_ref>", "<tf.Variable 'biases/b2:0' shape=(3, 1) dtype=float32_ref>", "<tf.Variable 'weights/W3:0' shape=(2, 3) dtype=float32_ref>", "<tf.Variable 'biases/b3:0' shape=(2, 1) dtype=float32_ref>"] and loss Tensor("Add_3:0", shape=(), dtype=float32).

Note that if I run

print(tf.trainable_variables())

just before the "optimizer = ..." line, I do see my trainable variables there:

[<tf.Variable 'weights/W1:0' shape=(4, 5) dtype=float32_ref>, <tf.Variable 'biases/b1:0' shape=(4, 1) dtype=float32_ref>, <tf.Variable 'weights/W2:0' shape=(3, 4) dtype=float32_ref>, <tf.Variable 'biases/b2:0' shape=(3, 1) dtype=float32_ref>, <tf.Variable 'weights/W3:0' shape=(2, 3) dtype=float32_ref>, <tf.Variable 'biases/b3:0' shape=(2, 1) dtype=float32_ref>]

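Does anyone have an idea what the problem might be?

Edit, adding some more info: in case you'd like to see how I create and initialize my parameters, here is the code. Maybe something is wrong in this part, but I can't see it.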

def get_nn_parameter(variable_scope, variable_name, dim1, dim2):
  with tf.variable_scope(variable_scope, reuse=tf.AUTO_REUSE):
    v = tf.get_variable(variable_name, 
                        [dim1, dim2], 
                        trainable=True, 
                        initializer = tf.contrib.layers.xavier_initializer())
  return v


def initialize_layer_parameters(num_units_in_layers):
    parameters = {}
    L = len(num_units_in_layers)

    for i in range (1, L):
        temp_weight = get_nn_parameter("weights",
                                       "W"+str(i), 
                                       num_units_in_layers[i], 
                                       num_units_in_layers[i-1])
        parameters.update({"W" + str(i) : temp_weight})  
        temp_bias = get_nn_parameter("biases",
                                     "b"+str(i), 
                                     num_units_in_layers[i], 
                                     1)
        parameters.update({"b" + str(i) : temp_bias})  

    return parameters


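Addendum

I got it working. Instead of writing a separate answer, I'm adding the correct version of my code here.

(David's answer below helped a lot.)

I simply removed my_sess as a parameter of my compute_cost function. (I couldn't make it work before, but it turns out it isn't needed at all.) I also reordered the statements in my main function so that things are called in the right order.

Here is the working version of the main code, including how I call my cost function: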

tf.reset_default_graph()

num_units_in_layers = [5,4,3,2]

X = tf.placeholder(shape=[5, 3], dtype=tf.float32)
Y = tf.placeholder(shape=[2, 3], dtype=tf.float32)
parameters = initialize_layer_parameters(num_units_in_layers)

my_sess = tf.Session()
ZL = forward_propagation_with_relu(X, num_units_in_layers, parameters)

cost = compute_cost(ZL, Y, parameters, 3, 0.05)
optimizer =  tf.train.AdamOptimizer(learning_rate = 0.001).minimize(cost)
init = tf.global_variables_initializer() 

my_sess.run(init)
_ , minibatch_cost = my_sess.run([optimizer, cost], 
                                 feed_dict={X: [[-1.,4.,-7.],[2.,6.,2.],[3.,3.,9.],[8.,4.,4.],[5.,3.,5.]], 
                                            Y: [[0.6, 0., 0.3], [0.4, 0., 0.7]]})


print(minibatch_cost)

my_sess.close()


Answer 1

I'm 99.9% sure you are creating your cost function incorrectly.

cost = compute_cost(ZL, Y, my_sess, parameters, batch_size=3, lambd=0.05)

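Your cost should be a tensor. You are passing your session into the cost function, which looks like it is actually trying to run the TensorFlow session inside it, and that is badly wrong.

Later on, you pass the result of compute_cost to your minimizer.

This is a common misunderstanding about TensorFlow.

TensorFlow follows a declarative programming paradigm: you first declare all the operations you want to run, and only afterwards do you pass data in and run them.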
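To make the "declare first, run later" idea concrete, here is a toy sketch in plain Python. It is not how TensorFlow is implemented, and every name in it (Node, variable, placeholder, run, reachable_variables) is made up for illustration. It only shows why a cost that has already been evaluated to a number can no longer be traced back to any variable, which is roughly the situation the "No gradients provided for any variable" error describes.

```python
class Node:
    """A deferred computation: it records what to compute, not a result."""

    def __init__(self, op, inputs=(), name=None, value=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.name = name
        self.value = value

    def __add__(self, other):
        return Node("add", (self, as_node(other)))

    def __mul__(self, other):
        return Node("mul", (self, as_node(other)))


def as_node(x):
    """Wrap a plain number as a constant node; leave nodes untouched."""
    return x if isinstance(x, Node) else Node("const", value=x)


def variable(name, value):
    return Node("var", name=name, value=value)


def placeholder(name):
    return Node("placeholder", name=name)


def run(node, feed):
    """Evaluate a node; placeholders get their values from `feed`."""
    if node.op in ("var", "const"):
        return node.value
    if node.op == "placeholder":
        return feed[node.name]
    a, b = (run(i, feed) for i in node.inputs)
    return a + b if node.op == "add" else a * b


def reachable_variables(node):
    """Names of all variables the node depends on through the graph."""
    if node.op == "var":
        return {node.name}
    names = set()
    for i in node.inputs:
        names |= reachable_variables(i)
    return names


w = variable("w", 2.0)
x = placeholder("x")

# Declared symbolically: the cost still "knows" it depends on w.
symbolic_cost = w * x + 1.0
print(run(symbolic_cost, {"x": 3.0}))      # 7.0
print(reachable_variables(symbolic_cost))  # {'w'}

# Evaluated too early: the number 6.0 carries no link back to w.
early_value = run(w * x, {"x": 3.0})
broken_cost = as_node(early_value) + 1.0
print(reachable_variables(broken_cost))    # set()
```

An optimizer can only compute gradients along the links recorded in the graph, so the second cost gives it nothing to work with.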

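Refactor your code to strictly follow this best practice:

(1) Create a build_graph() function and put all of your math operations inside it. Define your cost function and all the layers of the network there. Return the optimizer.minimize() training op (and any other ops you may want to fetch, such as accuracy).

(2) Now create a session.

(3) Do not create any more TensorFlow operations or variables after this point. If you feel you need to, you are doing something wrong.

(4) Call sess.run on your train_op, and pass in the placeholder data via feed_dict.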
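The four steps above could look roughly like the sketch below. It is written against tf.compat.v1 on the assumption that a TensorFlow 2.x install is being used; on TensorFlow 1.x, a plain "import tensorflow as tf" without the disable_eager_execution() call should be the equivalent. The question does not show forward_propagation_with_relu or compute_cost, so the layers and the cost here (mean squared error plus an L2 penalty) are stand-ins, not the asker's actual functions.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # use TF1-style graphs and sessions


def build_graph(num_units_in_layers, batch_size, lambd=0.05, learning_rate=0.001):
    """Step 1: declare every op up front. Nothing is executed in here."""
    X = tf.placeholder(tf.float32, shape=[num_units_in_layers[0], batch_size])
    Y = tf.placeholder(tf.float32, shape=[num_units_in_layers[-1], batch_size])

    A = X
    l2_penalty = 0.0
    num_layers = len(num_units_in_layers)
    for i in range(1, num_layers):
        W = tf.get_variable("W%d" % i,
                            [num_units_in_layers[i], num_units_in_layers[i - 1]],
                            initializer=tf.glorot_uniform_initializer())
        b = tf.get_variable("b%d" % i, [num_units_in_layers[i], 1],
                            initializer=tf.zeros_initializer())
        Z = tf.matmul(W, A) + b
        # ReLU on hidden layers, linear output on the last layer.
        A = tf.nn.relu(Z) if i < num_layers - 1 else Z
        l2_penalty += tf.nn.l2_loss(W)

    # Stand-in cost (assumption): MSE plus L2 regularization. It stays a tensor.
    cost = tf.reduce_mean(tf.square(A - Y)) + (lambd / batch_size) * l2_penalty

    train_op = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
    # Created after minimize() so that Adam's own variables are initialized too.
    init = tf.global_variables_initializer()
    return X, Y, cost, train_op, init


tf.reset_default_graph()
X, Y, cost, train_op, init = build_graph([5, 4, 3, 2], batch_size=3)

minibatch_X = np.array([[-1., 4., -7.], [2., 6., 2.], [3., 3., 9.],
                        [8., 4., 4.], [5., 3., 5.]], dtype=np.float32)
minibatch_Y = np.array([[0.6, 0., 0.3], [0.4, 0., 0.7]], dtype=np.float32)

# Step 2: create the session. Step 3: no new ops or variables from here on.
with tf.Session() as sess:
    sess.run(init)
    # Step 4: run the training op and feed the placeholder data.
    _, minibatch_cost = sess.run([train_op, cost],
                                 feed_dict={X: minibatch_X, Y: minibatch_Y})

print(minibatch_cost)
```

Note that the session is never passed into the graph-building code, which matches the fix described in the question's addendum.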

Here is a simple example of how to structure your code:

https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/3_NeuralNetworks/neural_network_raw.ipynb

In general, aymericdamien has put up a lot of very good examples, and I strongly recommend reviewing them to learn the basics of TensorFlow.


Solution 1
1 Accepted 2018-03-15 01:56:53
