Training a basic TensorFlow model using GradientTape

Question

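Purely for educational purposes, I am trying to build a simple neural network that classifies points in the plane, based on the basic training loops tutorial on the TensorFlow homepage.

So I stored some points in [0,1]x[0,1] in a tensor x of shape (250, 2, 1), and the corresponding labels (1. or 0.) in a tensor y of shape (250, 1, 1). Then I do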

import tensorflow as tf

w0 = tf.Variable(tf.random.normal([4,2]), name = 'w0')
w1 = tf.Variable(tf.random.normal([1,4]), name = 'w1')
b1 = tf.Variable(tf.zeros([4,1]), name = 'b1')
b2 = tf.Variable(tf.zeros([1,1]), name = 'b2')

loss = tf.keras.losses.CategoricalCrossentropy()

def forward(x):
    x0 = x
    z1 = tf.matmul(w0, x0) + b1
    x1 = tf.nn.relu(z1)
    z2 = tf.matmul(w1, x1) + b2
    x2 = tf.nn.sigmoid(z2)
    return x2

with tf.GradientTape() as t:
    current_loss = loss(y, forward(x))

gradients = t.gradient(current_loss, [b1, b2, w0, w1])

What I get is a list of tensors of the expected shapes, but they contain only zeros. Does anyone have any advice?

Answer 1

The problem arises because the labels and predictions do not have the shape the loss expects. In particular, the loss function tf.keras.losses.CategoricalCrossentropy expects labels in a one-hot representation, but your labels and predictions have shape (250, 1, 1), and the behaviour of the loss function is unclear in this case. Using tf.keras.losses.BinaryCrossentropy instead should solve the problem.
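For illustration, here is a minimal sketch of that change, reusing the variable names from the question. The data is a hypothetical stand-in (random points labelled by whether the first coordinate exceeds 0.5), since the original x and y are not shown. A likely reason for the all-zero gradients, stated here as an assumption rather than something confirmed above, is that CategoricalCrossentropy rescales predictions to sum to 1 along the last axis, which has size 1 here, so every prediction becomes a constant 1.

```python
import tensorflow as tf

tf.random.set_seed(0)

# Hypothetical stand-in data (not from the question): 250 points in
# [0,1]x[0,1], labelled 1. when the first coordinate exceeds 0.5.
x = tf.random.uniform([250, 2, 1])
y = tf.cast(x[:, :1, :] > 0.5, tf.float32)  # shape (250, 1, 1)

w0 = tf.Variable(tf.random.normal([4, 2]), name='w0')
w1 = tf.Variable(tf.random.normal([1, 4]), name='w1')
b1 = tf.Variable(tf.zeros([4, 1]), name='b1')
b2 = tf.Variable(tf.zeros([1, 1]), name='b2')

# Binary labels with a single sigmoid output call for binary cross-entropy.
loss = tf.keras.losses.BinaryCrossentropy()

def forward(x):
    z1 = tf.matmul(w0, x) + b1    # (4, 2) times (250, 2, 1) gives (250, 4, 1)
    x1 = tf.nn.relu(z1)
    z2 = tf.matmul(w1, x1) + b2   # (1, 4) times (250, 4, 1) gives (250, 1, 1)
    return tf.nn.sigmoid(z2)

with tf.GradientTape() as t:
    current_loss = loss(y, forward(x))

gradients = t.gradient(current_loss, [b1, b2, w0, w1])

# Report the largest absolute entry of each gradient.
for var, grad in zip([b1, b2, w0, w1], gradients):
    print(var.name, float(tf.reduce_max(tf.abs(grad))))
```

With this loss the printed magnitudes should no longer all be zero, and the gradients can then be applied to the variables as in the basic training loops tutorial.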

Answer 1: 0, accepted, 2021-04-01 21:49:38
