Cannot obtain the output of intermediate sub-model layers with tf2.0/keras
Say we use TensorFlow r2.0 and we want to use sub-models in Keras, so we have a model A like this:
def create_model_A(in_num_units, name):
    x = tf.keras.Input(shape=(in_num_units,))
    y = tf.keras.layers.Dense(in_num_units, activation='relu')(x)
    y = tf.keras.layers.BatchNormalization()(y)
    y = tf.keras.layers.Dense(in_num_units, activation='relu')(y)
    y = tf.keras.layers.BatchNormalization()(y)
    y = x + y
    return tf.keras.Model(x, y, name=name)
and a model B that makes use of model A:
def create_model_B(in_num_units):
    x = tf.keras.Input(shape=(in_num_units,))
    y = create_model_A(x.shape[-1], name='A_1')(x)
    y = tf.keras.layers.Dense(in_num_units // 2, name='downsize_1')(y)
    y = tf.keras.layers.BatchNormalization()(y)
    y = create_model_A(y.shape[-1], name='A_2')(y)
    y = tf.keras.layers.Dense(in_num_units // 2, name='downsize_2')(y)
    y = tf.keras.layers.BatchNormalization()(y)
    y = create_model_A(y.shape[-1], name='A_3')(y)
    y = tf.keras.layers.Dense(in_num_units // 2, name='downsize_3')(y)
    y = tf.keras.layers.BatchNormalization()(y)
    y = create_model_A(y.shape[-1], name='A_4')(y)
    return tf.keras.Model(x, y)
This works like a charm. We can instantiate a model B like this:
num_in_units = 500
model = create_model_B(num_in_units) # Works!
And we benefit from all the advantages of a tf.keras.Model. But the problem arises when we want to obtain the result of an intermediate layer that belongs to sub-model A. If the layer is part of model B, everything works:
inter_model_1 = tf.keras.Model(
    model.input, model.get_layer('downsize_1').output)  # Works!
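As a self-contained illustration (with hypothetical layer names and sizes, not the models above): extracting a top-level layer's output works because that layer lives directly in the top model's graph:

```python
import numpy as np
import tensorflow as tf

# Minimal sketch (hypothetical names/sizes): extract the output of a
# layer that belongs directly to the top model.
x = tf.keras.Input(shape=(8,))
y = tf.keras.layers.Dense(4, name='downsize_1')(x)
y = tf.keras.layers.Dense(2, name='head')(y)
model = tf.keras.Model(x, y)

# Reuse the same graph up to 'downsize_1' -- no disconnection here.
inter_model = tf.keras.Model(model.input,
                             model.get_layer('downsize_1').output)
features = inter_model(np.zeros((3, 8), dtype='float32'))
print(features.shape)  # (3, 4)
```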
But if the layer belongs to sub-model A, it crashes with a ValueError. This command:
inter_model_2 = tf.keras.Model(
    model.input, model.get_layer('A_3').output)  # Does not work!
Gives:
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_4:0", shape=(None, 250), dtype=float32) at layer "input_4". The following previous layers were accessed without issue: []
I'm not sure I understand all the inner mechanics of Keras, but from what I understood after diving into the source code, a sub-model used this way ends up with two input tensor objects. They can be printed like this:
print([n.input_tensors.name for n in model.get_layer('A_3').inbound_nodes])
['input_4:0', 'batch_normalization_5/Identity:0']
One is the sub-model's own tf.keras.Input, and the other is the input linked to the top model.
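This situation can be reproduced with a minimal sketch (hypothetical names, not the original models). The sub-model carries two graphs: its own Input graph created at construction, and the graph of the call inside the outer model. On TF 2.0, its .output resolves to the internal graph, which cannot be traced back to the outer model's input:

```python
import numpy as np
import tensorflow as tf

# Minimal sketch of the disconnect (hypothetical names).
inner_in = tf.keras.Input(shape=(4,))
inner = tf.keras.Model(inner_in, tf.keras.layers.Dense(4)(inner_in),
                       name='inner')

outer_in = tf.keras.Input(shape=(4,))
outer = tf.keras.Model(outer_in, inner(outer_in))
print(outer(np.zeros((2, 4), dtype='float32')).shape)  # (2, 4)

# 'inner' now has two inbound nodes: its own Input (from construction)
# and the call on outer_in. Depending on the TF version, a model built
# from outer.input to inner's output may raise "Graph disconnected".
try:
    tf.keras.Model(outer.input, outer.get_layer('inner').output)
    print('connected')
except ValueError as err:
    print('disconnected:', type(err).__name__)
```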
When building a new model from a top-model B input tensor to a top-model B output tensor, the path through the graph seems to correctly pass by the 'batch_normalization_5' input, and both tensors are properly connected in the graph.
However, when trying to link a top-model B input tensor to a sub-model A output tensor, the output tensor seems to be connected to the sub-model's own tf.keras.Input, and the two tensors are disconnected.
A solution I found for the moment is to use the top-model version of the tensor model.get_layer('A_3').output:
model.get_layer('A_3')._outbound_nodes[0].input_tensors
But this seems overcomplicated and not clean... In addition, it does not let us make use of the layers inside model A.
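A two-stage alternative (my own sketch, not from the original post, with hypothetical names) sidesteps the node bookkeeping entirely: build one model from B's input up to the tensor that feeds sub-model A, and a second model inside A's own internal graph, which also exposes A's inner layers. Chaining the two shares all the trained weights:

```python
import numpy as np
import tensorflow as tf

# Hypothetical minimal setup mirroring the question's structure:
# a sub-model 'A' called inside a top model.
a_in = tf.keras.Input(shape=(4,))
a_out = tf.keras.layers.Dense(4, name='inner_dense')(a_in)
sub_a = tf.keras.Model(a_in, a_out, name='A')

top_in = tf.keras.Input(shape=(4,))
pre = tf.keras.layers.Dense(4, name='pre')(top_in)
top = tf.keras.Model(top_in, sub_a(pre))

# Stage 1: top model up to the tensor that feeds sub-model A
# ('pre' is a top-level layer, so this graph is connected).
to_a = tf.keras.Model(top.input, top.get_layer('pre').output)
# Stage 2: sub-model A's own internal graph, which can expose any
# of its internal layers.
inside_a = tf.keras.Model(sub_a.inputs,
                          sub_a.get_layer('inner_dense').output)

x = np.zeros((2, 4), dtype='float32')
features = inside_a(to_a(x))
print(features.shape)  # (2, 4)
```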
I wonder if someone could shed some light on this particular tf.keras behavior. Am I approaching this correctly? Is this the intended behavior, or is it a bug? Thanks a lot!
Just change output to get_output_at(0):
inter_model_2 = tf.keras.Model(
    model.input, model.get_layer('A_3').get_output_at(0))