This simple code produces a None gradient. If I use another variable (a tf.Variable b) instead of "model_tmp.trainable_variables[0]", everything works and I get the correct gradient:
@tf.function
def cat(model, model_tmp):
    with tf.GradientTape(persistent=True, watch_accessed_variables=False) as g:
        g.watch(model.trainable_variables[0])
        model_tmp.trainable_variables[0] = tf.multiply(model.trainable_variables[0], 2)
        a = tf.reduce_mean(model_tmp.trainable_variables[0])
        grads_out = g.gradient(a, model.trainable_variables[0])
        tf.print(grads_out)
    return grads_out

cat(model, model2)
output:
None
model is a custom Keras model, and model2 is a clone of it (model2 = tf.keras.models.clone_model(model)). What is a possible root cause of this problem? Thanks.
It's probably because you have to run a forward pass on the model before TensorFlow can see the model's trainable variables. Try running the forward step between the g.watch() and g.gradient() calls.
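As a supplementary sketch (the model shape here is an assumption, not from the question): note that `trainable_variables` is a property that builds a fresh Python list on every access, so the item assignment `model_tmp.trainable_variables[0] = ...` is silently discarded and the later `reduce_mean` reads the untouched original variable, which is not connected to the watched variable on the tape. Keeping the multiplied tensor in a local variable preserves the connection and yields a non-None gradient:

```python
import tensorflow as tf

# Hypothetical stand-in for the custom Keras model in the question.
model = tf.keras.Sequential([tf.keras.layers.Dense(2, input_shape=(3,))])

@tf.function
def cat(model):
    with tf.GradientTape(persistent=True, watch_accessed_variables=False) as g:
        g.watch(model.trainable_variables[0])
        # Keep the result as a local tensor instead of assigning it into
        # the list returned by the trainable_variables property.
        doubled = tf.multiply(model.trainable_variables[0], 2)
        a = tf.reduce_mean(doubled)
    # Gradient of mean(2*w) w.r.t. w is 2/N for each of the N elements.
    return g.gradient(a, model.trainable_variables[0])

grads = cat(model)  # a (3, 2) tensor filled with 2/6, not None
```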