Converting a tensor to a NumPy array inside a custom training loop in TensorFlow 1.15

Question

I am training a model in TensorFlow and I want to print the model's loss after each batch. I am using a custom training loop that looks like this:

import tensorflow as tf
from tensorflow.keras.losses import cosine_similarity
from tensorflow.keras.optimizers import Adam


model = get_model(**model_params)
g = get_generator(**generator_params)

optimizer = Adam()
epochs = 10

for epoch in range(epochs):
   for i in range(len(g)):
      with tf.GradientTape() as tape:
         x,y = g[i]
         model_prediction = model(x)
         loss = cosine_similarity(y, model_prediction)
         gradients = tape.gradient(loss, model.trainable_weights)
         optimizer.apply_gradients(zip(gradients, model.trainable_weights))
         
         print(f"Batch {i}/{len(g)}. Loss: {loss.eval(session=tf.Session()): .4f}")

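Since the loss is a tensor, I need to convert it to a NumPy array to actually see its value (the plan is not to print the array itself, but once I can convert the tensor to an array, my problem is solved). Unfortunately, the way I have been trying it results in the following error: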

Failed precondition: Error while reading resource variable dense_5/kernel from Container: localhost. 
This could mean that the variable was uninitialized. Not found: Container localhost does not exist.

I also tried editing the loop to open a session alongside the tape:


for epoch in range(epochs):
   for i in range(len(g)):
      with tf.GradientTape() as tape, tf.Session() as session:
         # training code
         loss_numpy = session.run(loss)

This gives me the same error as above. I also tried initializing the global variables at every training step:


for epoch in range(epochs):
   for i in range(len(g)):
      with tf.GradientTape() as tape, tf.Session() as session:
         # training code
         init = tf.global_variables_initializer()
         session.run(init)
         print(f"Batch {i}/{len(g)}. Loss: {session.run(loss): .4f}")

This does not raise an error, but it is very slow and prints a lot of other Nvidia-related output that I would like to avoid.

Is there a way to avoid the error without having to initialize the variables at every step? Or perhaps there is a way to silence the Nvidia-related output?

Answer 1

Looking at the code and the error, my guess is that you're not juggling the scope correctly with respect to the TensorFlow session Keras needs and uses.

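One possibility is that it is not being initialized properly. That could happen because you are not using the normal Keras training regime, which handles that for you. Or the initialization may be partially done and then, because you are using the with statement, the session is closed as soon as the block inside the with finishes. That is what Python's with is for.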
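To illustrate the point about with, here is a minimal pure-Python sketch. FakeSession is a hypothetical stand-in written for this example, not a TensorFlow class; it only mimics the idea that a resource is closed when its with block ends.

```python
class FakeSession:
    """Hypothetical stand-in for a session-like resource (not a TensorFlow class)."""

    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Called automatically when the with block ends, even after an error.
        self.closed = True
        return False  # do not swallow exceptions

    def run(self, value):
        if self.closed:
            raise RuntimeError("session is closed")
        return value


def run_inside_and_after():
    """Return the value fetched inside the block and the closed flag afterwards."""
    with FakeSession() as session:
        inside = session.run(0.25)
    # The block has ended here, so the session has already been closed.
    return inside, session.closed


inside, closed_after = run_inside_and_after()
print(inside, closed_after)
```

Opening a new session in every loop iteration therefore also throws away whatever state the previous one held, which would be consistent with the uninitialized-variable error in the question.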

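I haven't tried this myself, but my hunch is that if you instantiate a session yourself before you start the training, and then hold on to that session throughout, this should work.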
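A rough sketch of that idea, under several assumptions: a toy linear model and random data stand in for the get_model and get_generator helpers from the question, a mean squared error replaces the cosine similarity loss, and the graph is built once with placeholders through the tf.compat.v1 API instead of using GradientTape. It has not been checked against the asker's actual model.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # TF1-style graph and session API


def train_with_one_session(steps=5, seed=0):
    """Train a toy linear model and return each step's loss as a Python float."""
    rng = np.random.RandomState(seed)
    x_batch = rng.rand(8, 4).astype(np.float32)  # stand-in for the generator output
    y_batch = rng.rand(8, 2).astype(np.float32)

    # Build the graph once, before any training step runs.
    graph = tf.Graph()
    with graph.as_default():
        x = tf1.placeholder(tf.float32, shape=[None, 4])
        y = tf1.placeholder(tf.float32, shape=[None, 2])
        w = tf1.get_variable("w", shape=[4, 2], initializer=tf1.zeros_initializer())
        loss = tf.reduce_mean(tf.square(tf.matmul(x, w) - y))
        train_op = tf1.train.GradientDescentOptimizer(0.1).minimize(loss)
        init = tf1.global_variables_initializer()

    losses = []
    # One session for the whole run; variables are initialized exactly once.
    with tf1.Session(graph=graph) as session:
        session.run(init)
        for i in range(steps):
            # session.run returns NumPy values, so no separate conversion is needed.
            _, loss_value = session.run(
                [train_op, loss], feed_dict={x: x_batch, y: y_batch}
            )
            losses.append(float(loss_value))
            print(f"Batch {i + 1}/{steps}. Loss: {loss_value:.4f}")
    return losses


losses = train_with_one_session()
```

Because the session outlives the loop, the initializer runs a single time and the per-step cost is just one session.run call.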

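By the way, you don't actually need to convert the loss to a NumPy object to print or otherwise inspect it. You will probably have an easier time (in terms of speed and stability) if you do the math in TensorFlow directly and avoid the conversion.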
