
tf.GradientTape doesn't return gradients

I'm using Tensorflow 2.1.0 with Python 3.7.7, in an Anaconda 3 environment running on Windows 7 64-bit.

This is my network:

import tensorflow as tf
from tensorflow import keras
from tensorflow.python.keras.models import Model
from tensorflow.python.keras.layers import Input, Dense, Conv2D, UpSampling2D, MaxPooling2D, Flatten, ZeroPadding2D
from tensorflow.python.keras.optimizers import Adam

def vgg16_encoder_decoder(input_size = (200,200,1)):
    #################################
    # Encoder
    #################################
    inputs = Input(input_size, name = 'input')

    conv1 = Conv2D(64, (3, 3), activation = 'relu', padding = 'same', name ='conv1_1')(inputs)
    conv1 = Conv2D(64, (3, 3), activation = 'relu', padding = 'same', name ='conv1_2')(conv1)
    pool1 = MaxPooling2D(pool_size = (2,2), strides = (2,2), name = 'pool_1')(conv1)

    conv2 = Conv2D(128, (3, 3), activation = 'relu', padding = 'same', name ='conv2_1')(pool1)
    conv2 = Conv2D(128, (3, 3), activation = 'relu', padding = 'same', name ='conv2_2')(conv2)
    pool2 = MaxPooling2D(pool_size = (2,2), strides = (2,2), name = 'pool_2')(conv2)

    conv3 = Conv2D(256, (3, 3), activation = 'relu', padding = 'same', name ='conv3_1')(pool2)
    conv3 = Conv2D(256, (3, 3), activation = 'relu', padding = 'same', name ='conv3_2')(conv3)
    conv3 = Conv2D(256, (3, 3), activation = 'relu', padding = 'same', name ='conv3_3')(conv3)
    pool3 = MaxPooling2D(pool_size = (2,2), strides = (2,2), name = 'pool_3')(conv3)

    conv4 = Conv2D(512, (3, 3), activation = 'relu', padding = 'same', name ='conv4_1')(pool3)
    conv4 = Conv2D(512, (3, 3), activation = 'relu', padding = 'same', name ='conv4_2')(conv4)
    conv4 = Conv2D(512, (3, 3), activation = 'relu', padding = 'same', name ='conv4_3')(conv4)
    pool4 = MaxPooling2D(pool_size = (2,2), strides = (2,2), name = 'pool_4')(conv4)

    conv5 = Conv2D(512, (3, 3), activation = 'relu', padding = 'same', name ='conv5_1')(pool4)
    conv5 = Conv2D(512, (3, 3), activation = 'relu', padding = 'same', name ='conv5_2')(conv5)
    conv5 = Conv2D(512, (3, 3), activation = 'relu', padding = 'same', name ='conv5_3')(conv5)
    pool5 = MaxPooling2D(pool_size = (2,2), strides = (2,2), name = 'pool_5')(conv5)

    #################################
    # Decoder
    #################################
    #conv1 = Conv2DTranspose(512, (2, 2), strides = 2, name = 'conv1')(pool5)

    upsp1 = UpSampling2D(size = (2,2), name = 'upsp1')(pool5)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv6_1')(upsp1)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv6_2')(conv6)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv6_3')(conv6)

    upsp2 = UpSampling2D(size = (2,2), name = 'upsp2')(conv6)
    conv7 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv7_1')(upsp2)
    conv7 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv7_2')(conv7)
    conv7 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv7_3')(conv7)
    zero1 = ZeroPadding2D(padding =  ((1, 0), (1, 0)), data_format = 'channels_last', name='zero1')(conv7)

    upsp3 = UpSampling2D(size = (2,2), name = 'upsp3')(zero1)
    conv8 = Conv2D(256, 3, activation = 'relu', padding = 'same', name = 'conv8_1')(upsp3)
    conv8 = Conv2D(256, 3, activation = 'relu', padding = 'same', name = 'conv8_2')(conv8)
    conv8 = Conv2D(256, 3, activation = 'relu', padding = 'same', name = 'conv8_3')(conv8)

    upsp4 = UpSampling2D(size = (2,2), name = 'upsp4')(conv8)
    conv9 = Conv2D(128, 3, activation = 'relu', padding = 'same', name = 'conv9_1')(upsp4)
    conv9 = Conv2D(128, 3, activation = 'relu', padding = 'same', name = 'conv9_2')(conv9)

    upsp5 = UpSampling2D(size = (2,2), name = 'upsp5')(conv9)
    conv10 = Conv2D(64, 3, activation = 'relu', padding = 'same', name = 'conv10_1')(upsp5)
    conv10 = Conv2D(64, 3, activation = 'relu', padding = 'same', name = 'conv10_2')(conv10)

    conv11 = Conv2D(1, 3, activation = 'relu', padding = 'same', name = 'conv11')(conv10)

    model = Model(inputs = inputs, outputs = conv11, name = 'vgg-16_encoder_decoder')

    return model

And this is the code that runs the network:

import tensorflow as tf
import numpy as np

import viacognita.utils as utils
import viacognita.vgg_16 as vgg16

# Global variables
#-------------------------------------------------------------------------------
image_rows = 200
image_cols = 200
channels = 1

# Load and preprocess the datasets
D = ... # A NumPy array with shape (960, 2, 200, 200, 1)

# Model function definitions.
# ------------------------------------------------------------------------------
def loss(model, x, y):
  y_ = model(x)
  return tf.convert_to_tensor(np.linalg.norm(y - y_), dtype=tf.float32)

def grad(model, inputs, targets):
    with tf.GradientTape() as tape:
        #tape.watch(model.trainable_variables)
        loss_value = loss(model, inputs, targets)

    return loss_value, tape.gradient(loss_value, model.trainable_variables)

# Model, optimizer and learner.
# ------------------------------------------------------------------------------

# Get the model.
model = vgg16.vgg16_encoder_decoder((image_rows, image_cols, channels))

# Let's set up the optimizer.
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

features = D[:,0,:]
labels = D[:,1,:]

print("D shape: ", D.shape) # Shape is (960, 2, 200, 200, 1)
print("Features shape: ", features.shape) # Shape is (960, 200, 200, 1)
print("Labels shape: ", labels.shape) # Shape is (960, 200, 200, 1)

print(features[0, :].shape)  # Shape is (200, 200, 1)
print(labels[0,:].shape) # Shape is (200, 200, 1)

# We'll use this to calculate a single optimization step:
loss_value, grads = grad(model, tf.convert_to_tensor(features[np.newaxis, 0,:], dtype=tf.float32), tf.convert_to_tensor(labels[np.newaxis, 0, :], dtype=tf.float32))
print("Step: {}, Initial Loss: {}".format(optimizer.iterations.numpy(),
                                          loss_value.numpy()))

I have copied much of this code from "Tensorflow - Custom training: walkthrough".

My problem is that the function grad returns 54 None values.

I have tried adding the (now commented-out) line tape.watch(model.trainable_variables), but it still returns 54 None values.

Any idea what I'm doing wrong?

The problem is that you are using a NumPy function as part of your loss computation and then converting the result back into a TensorFlow tensor. This gives a correct loss value, but it interrupts the gradient chain recorded by the gradient tape. Simply use the equivalent TensorFlow function, tf.norm, instead:

def loss(model, x, y):
  y_ = model(x)
  return tf.norm(y - y_)
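
With this change the whole computation stays inside TensorFlow, so tape.gradient should return actual tensors instead of None. As a minimal sanity check (a sketch, assuming the model, grad and optimizer defined above; the random input below is purely illustrative and not from the original post):

import numpy as np
import tensorflow as tf

# Hypothetical dummy sample with the same shape as one training example.
x = tf.convert_to_tensor(np.random.rand(1, 200, 200, 1), dtype=tf.float32)
y = tf.convert_to_tensor(np.random.rand(1, 200, 200, 1), dtype=tf.float32)

loss_value, grads = grad(model, x, y)
print(loss_value.numpy())                    # a finite scalar loss
print(sum(g is None for g in grads))         # should now be 0 rather than 54
optimizer.apply_gradients(zip(grads, model.trainable_variables))  # one SGD step

Equivalently, a built-in TensorFlow loss such as tf.keras.losses.MeanSquaredError() would also keep the computation differentiable.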
