TensorFlow 2: Why do the `W`s and `b`s in a multilayer perceptron neural network have the same `name`?

Question

I am new to TensorFlow 2 and am reading the documentation: https://www.tensorflow.org/api_docs/python/tf/Module

The part of that page relevant to my question is the MLP example (copied and pasted from there):

import tensorflow as tf

# Dense is the tf.Module subclass defined earlier on the same documentation page.
class MLP(tf.Module):
  def __init__(self, input_size, sizes, name=None):
    super(MLP, self).__init__(name=name)
    self.layers = []
    # Variables created inside this scope get the "mlp/" name prefix.
    with self.name_scope:
      for size in sizes:
        self.layers.append(Dense(input_dim=input_size, output_size=size))
        input_size = size

  @tf.Module.with_name_scope
  def __call__(self, x):
    for layer in self.layers:
      x = layer(x)
    return x

I don't understand why it produces the following output:

>>> module = MLP(input_size=5, sizes=[5, 5])
>>> module.variables
(<tf.Variable 'mlp/b:0' shape=(5,) ...>,
<tf.Variable 'mlp/w:0' shape=(5, 5) ...>,
<tf.Variable 'mlp/b:0' shape=(5,) ...>,
<tf.Variable 'mlp/w:0' shape=(5, 5) ...>,
)

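I expected mlp/b:1 and mlp/w:1 to appear there. I also ran the same code on my machine and got the same names, i.e. mlp/b:0 and mlp/w:0 each appear twice. Can anyone point out what I am missing? Does this result mean that the same W and b are being reused?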

Answer 1

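From the docs:

A tf.Variable represents a tensor whose value can be changed by running ops on it. Specific ops allow you to read and modify the values of this tensor. Higher level libraries like tf.keras use tf.Variable to store model parameters.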

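The :0 is not a layer number. It is the index of the output tensor produced by an operation in the underlying API. A variable has a single output, so its name always ends in :0. The two layers therefore hold separate variables that happen to share the names mlp/w and mlp/b.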
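Put differently, a tensor name can be read as `<op name>:<output index>`. A minimal sketch in plain Python, assuming that naming pattern; `split_tensor_name` is a hypothetical helper written for illustration, not a TensorFlow function:

```python
def split_tensor_name(tensor_name):
    """Split a name like 'mlp/w:0' into (op_name, output_index).

    Hypothetical helper for illustration only; assumes the
    '<op name>:<output index>' pattern described above.
    """
    op_name, sep, index = tensor_name.rpartition(":")
    if not sep or not index.isdigit():
        raise ValueError(f"not a tensor name: {tensor_name!r}")
    return op_name, int(index)


for name in ["mlp/w:0", "mlp/b:0", "split:2"]:
    op_name, output_index = split_tensor_name(name)
    print(f"{name}: op={op_name!r}, output index={output_index}")
```

Read this way, two variables both named mlp/w:0 each report output 0 of an op called mlp/w; the suffix says nothing about which layer they belong to.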

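For example, tf.Variable produces a single tensor [:0], whereas a 3-way split via tf.split produces three tensors [:0, :1, :2], one for each output of the corresponding operation in the computation graph: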

tf.Variable([1])
# has output
# <tf.Variable 'Variable:0' shape=(1,) dtype=int32, numpy=array([1], dtype=int32)>

and

import tensorflow as tf

# Tensor names of the form op:index are only available in graph mode.
tf.compat.v1.disable_eager_execution()
a, b, c = tf.split([1, 1, 1], 3)
print(a.name)   # split:0
print(b.name)   # split:1
print(c.name)   # split:2
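To check whether the repeated names mean that the same W and b are reused, one can compare the variable objects rather than their names. The sketch below is a self-contained version of the MLP from the question. It assumes TensorFlow 2 with eager execution (run it in a fresh session, not after `disable_eager_execution()`), and its `Dense` class is a simplified stand-in modeled on the one in the tf.Module documentation, since the question does not show it:

```python
import tensorflow as tf


class Dense(tf.Module):
    """Simplified stand-in for the Dense module in the tf.Module docs (assumption)."""

    def __init__(self, input_dim, output_size, name=None):
        super().__init__(name=name)
        self.w = tf.Variable(tf.random.normal([input_dim, output_size]), name="w")
        self.b = tf.Variable(tf.zeros([output_size]), name="b")

    def __call__(self, x):
        return tf.nn.relu(tf.matmul(x, self.w) + self.b)


class MLP(tf.Module):
    def __init__(self, input_size, sizes, name=None):
        super().__init__(name=name)
        self.layers = []
        with self.name_scope:
            for size in sizes:
                self.layers.append(Dense(input_dim=input_size, output_size=size))
                input_size = size

    @tf.Module.with_name_scope
    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x


module = MLP(input_size=5, sizes=[5, 5])
first, second = module.layers

# Names repeat across layers, but each layer owns its own variable objects.
names = [v.name for v in module.variables]
print(names)
print(first.w is second.w)  # False: two separate weight matrices
print(first.b is second.b)  # False: two separate bias vectors
```

The names are labels only. Eager mode does not make them unique, so identity (`is`) is the reliable way to tell variables apart.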

For reference, see this post.

Solution 1 1 2022-07-18 14:53:11
