

How to reuse tensorflow variables in eager execution mode?

When calling the get_variable() function in tensorflow, the behavior of the "reuse" flag is described in the tensorflow api doc as follows:

reuse: True, None, or tf.AUTO_REUSE; ... When eager execution is enabled, this argument is always forced to be tf.AUTO_REUSE.

However, when I actually run the demo code suggested on that page:

tf.enable_eager_execution()
def foo():
  with tf.variable_scope("foo", reuse=tf.AUTO_REUSE):
    v = tf.get_variable("v", [1])
  return v
v1 = foo()  # Creates v.
v2 = foo()  # Gets the same, existing v.
assert v1 == v2

It fails. (It passes if the first line is removed, as expected.)

So how does one reuse a variable in eager mode? Is this a bug, or am I missing something?

In eager mode, things are simpler... except for people (like me) whose brains have been damaged by using graph models for too long.

Eager works in the standard fashion: variables last only as long as they are referenced. If you stop referencing them, they are gone.

To do variable sharing, you do the same thing you would naturally do if you were using numpy (or really anything else) for the computation: you store variables in an object, and you reuse that object.
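In eager TF, that can be as simple as the following sketch (the Foo class is hypothetical, just to illustrate the idea):

import tensorflow as tf
tf.enable_eager_execution()

class Foo(object):
    def __init__(self):
        # the variable is created once and owned by the object
        self.v = tf.Variable(tf.zeros([1]), name="v")

foo = Foo()
v1 = foo.v
v2 = foo.v  # reusing the object means reusing its variable
assert v1 is v2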

This is why eager has so much affinity with the keras API: keras deals mostly with objects.

So look at your function again in numpy terms, for example (useful for those like me recovering from graphs). Would you expect two calls to foo to return the same array object? Of course not.
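For example, a plain-numpy analogue of the question's foo:

import numpy as np

def foo():
    return np.zeros(1)  # allocates a fresh array on every call

v1 = foo()
v2 = foo()
assert v1 is not v2  # two calls, two distinct objects

That is exactly what get_variable does under eager execution: each call creates a new variable.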

The documentation in tensorflow/python/ops/variable_scope.py appears to have been updated since.

From line 310:

"reuse: a Boolean, None, or tf.AUTO_REUSE. Controls reuse or creation of variables. When eager execution is enabled this argument is always forced to be False." “reuse:a Boolean,None或tf.AUTO_REUSE。控制重用或创建变量。当启用eager执行时,此参数始终强制为False。”

and from line 2107:

"When eager execution is enabled, new variables are always created unless an EagerVariableStore or template is currently active." “当启用eager执行时,除非EagerVariableStore或模板当前处于活动状态,否则始终会创建新变量。”

I found it easiest to reuse variables in Eager Execution by simply passing a reference to the same variable around:

import numpy as np
import tensorflow as tf

tf.enable_eager_execution()

class MyLayer(tf.keras.layers.Layer):
    def __init__(self):
        super(MyLayer, self).__init__()

    def build(self, input_shape):
        # bias specific for each layer
        self.B = self.add_variable('B', [1])

    def call(self, input, A):
        # some function involving input, common weights, and layer-specific bias
        return tf.matmul(input, A) + self.B

class MyModel(tf.keras.Model):    
    def __init__(self):
        super(MyModel, self).__init__()

    def build(self, input_shape):
        # common vector of weights
        self.A = self.add_variable('A', [int(input_shape[-1]), 1])

        # layers which will share A
        self.layer1 = MyLayer()
        self.layer2 = MyLayer()

    def call(self, input):
        result1 = self.layer1(input, self.A)
        result2 = self.layer2(input, self.A)
        return result1 + result2

if __name__ == "__main__":
    data = np.random.normal(size=(1000, 3))
    model = MyModel()
    predictions = model(data)
    print('\n\n')
    model.summary()
    print('\n\n')
    print([v.name for v in model.trainable_variables])

The output is the model summary followed by the variable names (screenshot omitted; it lists 5 trainable parameters in total).

Thus, we have a shared weight parameter my_model/A of dimension 3 and two bias parameters my_model/my_layer/B and my_model/my_layer_1/B of dimension 1 each, for a total of 5 trainable parameters. The code runs on its own, so feel free to play around with it.
