
TensorFlow Keras GradientTape returns None for a trainable variable of one model that depends on a trainable variable of another model

This simple code produces a None gradient. If I use another variable (a plain tf.Variable b) instead of "model_tmp.trainable_variables[0]", everything works and I get the correct gradient (see the sketch after the question).

@tf.function
def cat(model, model_tmp):
    with tf.GradientTape(persistent=True, watch_accessed_variables=False) as g:
        g.watch(model.trainable_variables[0])
        model_tmp.trainable_variables[0] = tf.multiply(model.trainable_variables[0], 2)        
        a = tf.reduce_mean(model_tmp.trainable_variables[0])
        grads_out = g.gradient(a, model.trainable_variables[0])
        tf.print(grads_out) 
        return grads_out

cat(model, model2)

output:

None

model is a custom Keras model. model2 is a clone of the first model (model2 = tf.keras.models.clone_model(model)). What could be the root of this problem? Thanks
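
The asker reports that the gradient comes out correctly when the intermediate result is held in an ordinary variable rather than assigned into model_tmp.trainable_variables[0]. Below is a minimal sketch of that working variant, assuming b is simply a local name for the intermediate tensor (not an element of the cloned model); the function name is illustrative.

import tensorflow as tf

@tf.function
def cat_working(model):
    with tf.GradientTape(persistent=True, watch_accessed_variables=False) as g:
        g.watch(model.trainable_variables[0])
        # b is a tensor computed from the watched variable, so the
        # dependency is recorded on the tape
        b = tf.multiply(model.trainable_variables[0], 2)
        a = tf.reduce_mean(b)
        grads_out = g.gradient(a, model.trainable_variables[0])
        tf.print(grads_out)
        return grads_out

Here the gradient of a = mean(2 * w) with respect to w is simply 2/n for every element, so grads_out is a tensor rather than None.
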

It's probably because you have to run a forward pass on the model before TensorFlow can see the model's trainable variables on the tape. Run the forward step between the g.watch() and g.gradient() calls.
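
A minimal sketch of that pattern, assuming a dummy input x whose shape matches the model's input; the loss here is just a placeholder mean of the outputs.

import tensorflow as tf

def grad_after_forward(model, x):
    with tf.GradientTape() as g:
        # The forward pass is recorded on the tape, so every trainable
        # variable the model uses becomes reachable from the loss
        y = model(x, training=True)
        loss = tf.reduce_mean(y)
    grads = g.gradient(loss, model.trainable_variables)
    return grads

With the default watch_accessed_variables=True, the tape watches the model's trainable variables automatically, so an explicit g.watch() is not needed in this version.
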
