
How to reuse tensorflow variables in eager execution mode?

According to the TensorFlow API documentation, the "reuse" argument of tf.get_variable() is forced to tf.AUTO_REUSE when eager execution is enabled:

reuse: True, None, or tf.AUTO_REUSE; ... When eager execution is enabled, this argument is always forced to be tf.AUTO_REUSE .

However, when I run the demo code suggested on that page with eager execution enabled:

import tensorflow as tf
tf.enable_eager_execution()  # TF 1.x API

def foo():
  with tf.variable_scope("foo", reuse=tf.AUTO_REUSE):
    v = tf.get_variable("v", [1])
  return v
v1 = foo()  # Creates v.
v2 = foo()  # Gets the same, existing v.
assert v1 == v2

the assertion fails. (It passes if the tf.enable_eager_execution() line is removed, as expected.)

So how do I reuse a variable in eager mode? Is this a bug, or am I missing something?

In eager mode, things are simpler... except for people who (like me) have been brain-damaged by using graph models for too long.

Eager works in a standard fashion, where variables last only while they are referenced. If you stop referencing them, they are gone.

To do variable sharing, you do the same thing you would naturally do if you were to use numpy (or really anything else) to do the computation: you store variables in an object, and you reuse this object.

This is why eager execution has so much affinity with the Keras API: Keras deals mostly with objects.

So look again at your functions in terms of numpy for example (useful for those like me recovering from graphs). Would you expect two calls to foo to return the same array object? Of course not.
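
The numpy comparison above can be sketched as follows. This is only an illustration of object-based sharing, not TensorFlow code; make_weights and SharedWeights are hypothetical names chosen for the example.

```python
import numpy as np


def make_weights():
    # Each call allocates a brand-new array, much as each call to
    # foo() creates a brand-new variable in eager mode.
    return np.zeros(1)


class SharedWeights:
    """Hypothetical holder: create the array once, then reuse the object."""

    def __init__(self):
        self.v = np.zeros(1)

    def get(self):
        return self.v


a, b = make_weights(), make_weights()
print(a is b)  # False: two independent arrays

holder = SharedWeights()
c, d = holder.get(), holder.get()
print(c is d)  # True: the same array object

c[0] = 3.0     # an in-place update is visible through every reference
print(d[0])    # 3.0
```

Sharing here is just ordinary Python object identity: whoever holds a reference to the holder sees the same array.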

The documentation in tensorflow/python/ops/variable_scope.py appears to have been updated since the question was asked.

From line 310:

"reuse: a Boolean, None, or tf.AUTO_REUSE. Controls reuse or creation of variables. When eager execution is enabled this argument is always forced to be False."

and from line 2107:

"When eager execution is enabled, new variables are always created unless an EagerVariableStore or template is currently active."

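To make the quoted behaviour concrete, here is a toy model of the idea, written with numpy rather than TensorFlow. ToyVariableStore and get_variable_without_store are hypothetical names; this is not how EagerVariableStore is implemented, only a sketch of "look up by name if a store exists, otherwise create a new object".

```python
import numpy as np


class ToyVariableStore:
    """Hypothetical name-keyed store; not TensorFlow's implementation."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, name, shape):
        # Create on the first request, return the existing object afterwards.
        if name not in self._vars:
            self._vars[name] = np.zeros(shape)
        return self._vars[name]


def get_variable_without_store(name, shape):
    # With no store active, nothing remembers earlier calls,
    # so every call yields a new object regardless of the name.
    return np.zeros(shape)


store = ToyVariableStore()
v1 = store.get_variable("foo/v", [1])
v2 = store.get_variable("foo/v", [1])
print(v1 is v2)  # True: the store returned the existing object

w1 = get_variable_without_store("foo/v", [1])
w2 = get_variable_without_store("foo/v", [1])
print(w1 is w2)  # False: a new object on every call
```
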
I found it easiest to reuse variables in Eager Execution by simply passing a reference to the same variable around:

import numpy as np
import tensorflow as tf


class MyLayer(tf.keras.layers.Layer):
    def __init__(self):
        super(MyLayer, self).__init__()

    def build(self, input_shape):
        # Bias specific to each layer.
        # add_weight is used instead of the deprecated add_variable alias.
        self.B = self.add_weight(name='B', shape=[1])

    def call(self, inputs, A):
        # Some function involving the input, the common weights A,
        # and the layer-specific bias B.
        return tf.matmul(inputs, A) + self.B


class MyModel(tf.keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()

    def build(self, input_shape):
        # Common vector of weights, shared by both layers.
        self.A = self.add_weight(name='A', shape=[int(input_shape[-1]), 1])

        # Layers which will share A.
        self.layer1 = MyLayer()
        self.layer2 = MyLayer()

    def call(self, inputs):
        # The same variable object is passed to both layers.
        result1 = self.layer1(inputs, self.A)
        result2 = self.layer2(inputs, self.A)
        return result1 + result2


if __name__ == "__main__":
    # Cast to float32 so the data matches the default dtype of the weights.
    data = np.random.normal(size=(1000, 3)).astype(np.float32)
    model = MyModel()
    predictions = model(data)
    print([v.name for v in model.trainable_variables])

The output lists three trainable variables. There is one shared weight parameter, my_model/A, of dimension 3, and two bias parameters, my_model/my_layer/B and my_model/my_layer_1/B, of dimension 1 each, for a total of 5 trainable parameters. The code runs on its own, so feel free to play around with it.
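
The total of 5 can be checked from the shapes alone. The sketch below assumes the shapes used in the code above (A is [3, 1] for the 3 input features, and each B is [1]); it needs only the standard library, not TensorFlow.

```python
from math import prod

# Shapes as described in the answer: one shared weight matrix and
# one bias per layer. The names mirror the variable names in the text.
shapes = {
    "my_model/A": (3, 1),
    "my_model/my_layer/B": (1,),
    "my_model/my_layer_1/B": (1,),
}

# Number of scalar parameters per variable is the product of its dimensions.
counts = {name: prod(shape) for name, shape in shapes.items()}
total = sum(counts.values())

print(counts)
print(total)  # 5
```

Because A is stored once and only referenced by both layers, it contributes 3 parameters rather than 6.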
