tf.GradientTape with outer product returns None
I am trying to post-process my model's prediction before computing the loss function, since my true data (y_train) is the outer product of the NN output. I have followed these steps:
import numpy as np

nX = 201
nT = 101
nNNout = nX + nT
nBatch = 32
NNout = np.random.rand(nBatch, nNNout)
f = NNout[:, :nX]
g = NNout[:, nX:]
test = np.empty([nBatch, nX * nT])
for i in range(nBatch):
    # Column-major ('F') flattening of the nX-by-nT outer product
    test[i, :] = np.outer(f[i, :], g[i, :]).flatten('F')
where the NN output contains f and g. What I actually need is the vectorised version of the outer product of f and g for each batch instance:
test2 = tf.Variable([tf.reshape(tf.transpose(tf.tensordot(f[i,:],g[i,:], axes=0)),[nX*nT]) for i in range(nBatch)])
I have checked that this is correct and that it outputs the same values as step 1.
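As a point of reference for the layout being targeted, the per-sample loop from step 1 can also be written as one batched operation. The following is only a minimal NumPy sketch: the helper name batched_outer_flat and the small array sizes are illustrative assumptions, not part of the original code.

```python
import numpy as np

def batched_outer_flat(f, g):
    """Column-major flattened outer product of each row pair (f[b], g[b])."""
    # out[b, j, i] = f[b, i] * g[b, j]; a C-order reshape of this layout
    # equals np.outer(f[b], g[b]).flatten('F').
    n_batch = f.shape[0]
    return np.einsum('bi,bj->bji', f, g).reshape(n_batch, -1)

# Small illustrative sizes (the question uses nX = 201, nT = 101, nBatch = 32).
rng = np.random.default_rng(0)
f = rng.random((4, 5))
g = rng.random((4, 3))

# Reference result built with the same loop as in step 1.
loop = np.stack([np.outer(f[b], g[b]).flatten('F') for b in range(4)])
vectorised = batched_outer_flat(f, g)
```

The same index pattern ('bi,bj->bji' followed by a reshape) carries over to tf.einsum and tf.reshape, which keeps the whole computation inside TensorFlow ops.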
n_epochs = 20
batch_size = 32
n_steps = len(x_train) // batch_size
optimizer = keras.optimizers.Nadam(learning_rate=0.01)
loss_fn = keras.losses.mean_squared_error
mean_loss = keras.metrics.Mean()
metrics = [keras.metrics.MeanAbsoluteError()]
# ------------ Training ------------
for epoch in range(1, n_epochs + 1):
    print("Epoch {}/{}".format(epoch, n_epochs))
    for step in range(1, n_steps + 1):
        X_batch, y_batch = random_batch(x_train, np.array(y_train))
        with tf.GradientTape() as tape:
            y_pred = model(X_batch, training=True)
            u_pred = tf.Variable([tf.reshape(tf.transpose(tf.tensordot(y_pred[i, :nX], y_pred[i, nX:], axes=0)), [nX * nT]) for i in range(batch_size)])
            main_loss = tf.reduce_mean(loss_fn(y_batch, u_pred))
            loss = tf.add_n([main_loss] + model.losses)
        gradients = tape.gradient(loss, model.trainable_variables)
My main issue is that gradients becomes a list of Nones when I add this operation. If I simply compute the loss function with my model's prediction (y_pred), the code is able to compute the gradients.
Could you please help me find the error I am making here?
You are creating a new (trainable) variable in u_pred, which breaks any dependency of u_pred on y_pred. The values match because you initialise the new variable with the prediction, but there is no longer any functional dependency between the two, so no gradients flow back to the model.
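To see the effect in isolation, here is a small sketch that assumes TensorFlow 2.x in eager mode. The variable x is a hypothetical stand-in for the model's weights and is not taken from the question.

```python
import tensorflow as tf

x = tf.Variable([1.0, 2.0, 3.0])  # hypothetical stand-in for the model's weights

# Case 1: wrap the intermediate result in a new tf.Variable.
with tf.GradientTape() as tape:
    y = 2.0 * x                    # differentiable function of x
    copy = tf.Variable(y)          # new variable: same values, no link back to x
    loss_broken = tf.reduce_sum(copy ** 2)
grad_broken = tape.gradient(loss_broken, x)   # no path from loss to x

# Case 2: use the tensor directly.
with tf.GradientTape() as tape:
    y = 2.0 * x
    loss_ok = tf.reduce_sum(y ** 2)
grad_ok = tape.gradient(loss_ok, x)           # d/dx sum((2x)^2) = 8x
```

In the first case the tape only records that the loss depends on copy, so the gradient with respect to x is None, which is the same symptom as in the training loop.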
I am guessing that you did that because you needed a tf.Tensor and not a list, and you ended up with type errors. You probably want to use something along the lines of tf.stack (or tf.concat) and not tf.Variable for that.
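A possible way to apply this, sketched under some assumptions: the weight matrix W and the random inputs below are hypothetical stand-ins for the model and the data, and the sizes are kept small for illustration. Option 1 keeps the per-sample list from the question and joins it with tf.stack; option 2 replaces the loop with a single tf.einsum, which is an alternative not mentioned in the original answer.

```python
import tensorflow as tf

nX, nT, batch = 4, 3, 2   # small illustrative sizes, not the question's 201 and 101
tf.random.set_seed(0)
W = tf.Variable(tf.random.normal([5, nX + nT]))   # hypothetical stand-in for model weights
X_batch = tf.random.normal([batch, 5])
y_batch = tf.random.normal([batch, nX * nT])

with tf.GradientTape(persistent=True) as tape:
    y_pred = tf.matmul(X_batch, W)                 # plays the role of model(X_batch)
    f, g = y_pred[:, :nX], y_pred[:, nX:]

    # Option 1: keep the per-sample loop, but join the pieces with tf.stack,
    # which is a differentiable op (unlike creating a tf.Variable).
    u_stack = tf.stack(
        [tf.reshape(tf.transpose(tf.tensordot(f[i], g[i], axes=0)), [nX * nT])
         for i in range(batch)])

    # Option 2: one batched einsum; u[b, j, i] = f[b, i] * g[b, j],
    # then flatten the last two axes.
    u_einsum = tf.reshape(tf.einsum('bi,bj->bji', f, g), [-1, nX * nT])

    loss_stack = tf.reduce_mean(tf.square(y_batch - u_stack))
    loss_einsum = tf.reduce_mean(tf.square(y_batch - u_einsum))

grad_stack = tape.gradient(loss_stack, W)
grad_einsum = tape.gradient(loss_einsum, W)
del tape  # release the persistent tape
```

Both options should give the same u_pred values and non-None gradients. The einsum form avoids the Python loop over the batch, which may matter at the question's sizes.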