
ValueError: No gradients provided for any variable - Tensorflow 2.0

I'm trying to find the latent code corresponding to an MNIST image, using a well-trained MNIST GAN model. My plan is to apply gradient descent on a loss defined as the distance between the target and the generated sample. As the generated sample gets closer to the target, the loss decreases, and the corresponding latent code is what I need.

Here is my code:

import numpy as np
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras import Sequential
import tensorflow.keras.backend as K
from tensorflow.keras.datasets import mnist
from tensorflow.keras import layers
from tensorflow.keras.layers import *
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.losses import MeanSquaredError
import random

### Load MNIST data
(data_x, _), _ = mnist.load_data()
data_x = np.reshape(np.asarray(data_x), [60000, 28*28]).astype(float)
train = data_x[:1024]
test = data_x[-10:]

### GAN setting
generator = Sequential([
    Dense(7 * 7 * 64, input_shape=[100]),
    BatchNormalization(),
    LeakyReLU(),
    Reshape([7, 7, 64]),
    UpSampling2D([2, 2]),
    Conv2DTranspose(64, [3, 3], padding='same'),
    BatchNormalization(),
    LeakyReLU(),
    UpSampling2D([2, 2]),
    Conv2DTranspose(1, [3, 3], padding='same', activation='sigmoid')
])

discriminator = Sequential([
    Conv2D(64, [3, 3], padding='same', input_shape=[28, 28, 1]),
    BatchNormalization(),
    LeakyReLU(),
    MaxPool2D([2, 2]),
    Conv2D(64, [3, 3], padding='same'),
    BatchNormalization(),
    LeakyReLU(),
    MaxPool2D([2, 2]),
    Flatten(),
    Dense(128),
    BatchNormalization(),
    LeakyReLU(),
    Dense(1, activation='sigmoid')
])

x_input = Input([28, 28, 1])
g_sample_input = Input([100])

log_clip = Lambda(lambda x: K.log(x + 1e-3))

sample_score = discriminator(generator(g_sample_input))

d_loss = (
    - log_clip(discriminator(x_input)) 
    - log_clip(1.0 - sample_score)
)
fit_discriminator = Model(inputs=[x_input, g_sample_input], outputs=d_loss)
fit_discriminator.add_loss(d_loss)
generator.trainable = False
for layer in generator.layers:
    if isinstance(layer, BatchNormalization):
        layer.trainable = True
fit_discriminator.compile(optimizer=Adam(0.001))
generator.trainable = True

g_loss = (
    - log_clip(sample_score)
)
fit_generator = Model(inputs=g_sample_input, outputs=g_loss)
fit_generator.add_loss(g_loss)
discriminator.trainable = False
for layer in discriminator.layers:
    if isinstance(layer, BatchNormalization):
        layer.trainable = True
fit_generator.compile(optimizer=Adam(0.001))
discriminator.trainable = True

### GAN training
train_x = train.reshape([-1, 28, 28, 1]) / 255
batch_size = 64
for i in range(10000):
    x = train_x[random.sample(range(len(train_x)), batch_size)]
    g_sample = np.random.uniform(-1, 1, [batch_size, 100])
    fit_discriminator.fit([K.constant(x), K.constant(g_sample)])
    fit_generator.fit(g_sample)
    
### Search for latent code
target = (test[0] / 255).reshape([28, 28])
mse = MeanSquaredError()
z = np.random.uniform(-1, 1, [1, 100])
z_t = tf.Variable(z, trainable=True)
opt = SGD(learning_rate=0.1)

for _ in range(10):
    loss_fn = lambda: mse(target,
                          generator(z_t.numpy())[0].numpy().reshape([28, 28]))

    opt.minimize(loss_fn, var_list=[z_t])

And I get this error:

ValueError: No gradients provided for any variable: ['Variable:0'].

It seems that TensorFlow cannot calculate the gradient from this kind of loss.

Is there a way to calculate the gradient if the loss is derived from another model? Or is there a way to achieve my goal without calculating the gradient?

I think I found the key point.

TensorFlow calculates gradients based on the graph, so every operation must stay inside the graph.

The error in my code is caused by the tensor-to-NumPy conversion: once a tensor is converted to a NumPy array, it is taken out of the graph and TensorFlow can no longer track it.
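A minimal standalone sketch (not tied to the GAN above) showing how a `.numpy()` call breaks gradient tracking under `tf.GradientTape`:

```python
import tensorflow as tf

x = tf.Variable(3.0)

# The computation stays inside the graph: the tape can follow it.
with tf.GradientTape() as tape:
    y = x * x
print(tape.gradient(y, x))  # tf.Tensor(6.0, shape=(), dtype=float32)

# Calling .numpy() detaches the value from the graph, so the
# resulting tensor has no connection to x and no gradient exists.
with tf.GradientTape() as tape:
    y_detached = tf.constant(x.numpy() * x.numpy())
print(tape.gradient(y_detached, x))  # None
```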

This is my new code, and it runs well now:

mse = MeanSquaredError()
target = (test[0] / 255).reshape([28, 28])
target_t = tf.convert_to_tensor(target)
z = np.random.uniform(-1, 1, [1, 100])
z_t = tf.Variable(z, trainable=True)
opt = SGD(learning_rate=0.1)

for _ in range(10):
    loss_fn = lambda: mse(target_t,
                          tf.reshape(tf.cast(generator(z_t), tf.float64), [28, 28]))
    opt.minimize(loss_fn, var_list=[z_t])

(Only the last piece of the code is shown here; the rest remains the same.)
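For reference, the same search can also be written with an explicit `tf.GradientTape` loop instead of `opt.minimize`. This is only a sketch: the trained `generator` and the `target` image are replaced with hypothetical stand-ins so the snippet runs on its own.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for the trained generator from the question:
# any differentiable map from a [1, 100] latent code to a
# [1, 28, 28, 1] image works for this sketch.
generator = tf.keras.Sequential([
    tf.keras.layers.Input([100]),
    tf.keras.layers.Dense(28 * 28, activation='sigmoid'),
    tf.keras.layers.Reshape([28, 28, 1]),
])

target_t = tf.random.uniform([28, 28])  # stand-in target image
z_t = tf.Variable(np.random.uniform(-1, 1, [1, 100]), dtype=tf.float32)
opt = tf.keras.optimizers.SGD(learning_rate=0.1)
mse = tf.keras.losses.MeanSquaredError()

for _ in range(10):
    with tf.GradientTape() as tape:
        # Keep everything as tensors so the tape can track it.
        sample = tf.reshape(generator(z_t), [28, 28])
        loss = mse(target_t, sample)
    # Gradient of the loss with respect to the latent code only.
    grads = tape.gradient(loss, [z_t])
    opt.apply_gradients(zip(grads, [z_t]))
```

Writing the tape out explicitly makes it easier to inspect the gradients (e.g. to confirm they are not `None`) than the `loss_fn` closure passed to `opt.minimize`.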

