
TensorFlow Keras GradientTape returns None for a trainable variable of one model that depends on a trainable variable of another model

This simple code produces a None gradient. If I use another variable (a plain tf.Variable b) instead of "model_tmp.trainable_variables[0]", everything works and I get the correct gradient (see the sketch after the question).

@tf.function
def cat(model, model_tmp):
    with tf.GradientTape(persistent=True, watch_accessed_variables=False) as g:
        g.watch(model.trainable_variables[0])
        model_tmp.trainable_variables[0] = tf.multiply(model.trainable_variables[0], 2)        
        a = tf.reduce_mean(model_tmp.trainable_variables[0])
        grads_out = g.gradient(a, model.trainable_variables[0])
        tf.print(grads_out) 
        return grads_out

cat(model, model2)

output:

None

model is a custom Keras model. model2 is a clone of the first model (model2 = tf.keras.models.clone_model(model)). What could be the root of this problem? Thanks
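
The asker reports that the gradient comes out correctly when the intermediate result is held in an ordinary variable rather than assigned into model_tmp.trainable_variables[0]. Below is a minimal sketch of that working variant, assuming b is simply a local name for the intermediate tensor (not an element of the cloned model); the function name is illustrative.

import tensorflow as tf

@tf.function
def cat_working(model):
    with tf.GradientTape(persistent=True, watch_accessed_variables=False) as g:
        g.watch(model.trainable_variables[0])
        # b is a tensor computed from the watched variable, so the
        # dependency is recorded on the tape
        b = tf.multiply(model.trainable_variables[0], 2)
        a = tf.reduce_mean(b)
        grads_out = g.gradient(a, model.trainable_variables[0])
        tf.print(grads_out)
        return grads_out

Here the gradient of a = mean(2 * w) with respect to w is simply 2/n for every element, so grads_out is a tensor rather than None.
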

It's probably because you have to run a forward pass on the model before TensorFlow can see the model's trainable variables on the tape. Run the forward step between the g.watch() and g.gradient() calls.
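
A minimal sketch of that pattern, assuming a dummy input x whose shape matches the model's input; the loss here is just a placeholder mean of the outputs.

import tensorflow as tf

def grad_after_forward(model, x):
    with tf.GradientTape() as g:
        # The forward pass is recorded on the tape, so every trainable
        # variable the model uses becomes reachable from the loss
        y = model(x, training=True)
        loss = tf.reduce_mean(y)
    grads = g.gradient(loss, model.trainable_variables)
    return grads

With the default watch_accessed_variables=True, the tape watches the model's trainable variables automatically, so an explicit g.watch() is not needed in this version.
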
