I am using the Keras functional API and I am confused about the output.
import tensorflow as tf
model = tf.keras.applications.ResNet50(include_top=False,
    weights=None,
    input_shape=(100, 100, 3),
    pooling='max',
    classifier_activation='relu')
layer_outputs = [layer.output for layer in model.layers[:15]]
model2 = tf.keras.models.Model(model.input, layer_outputs)
model2.summary()
_________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
=================================================================================
input_13 (InputLayer) [(None, 100, 100, 3) 0
_________________________________________________________________________________
conv1_pad (ZeroPadding2D) (None, 106, 106, 3) 0 input_13[0][0]
_________________________________________________________________________________
conv1_conv (Conv2D) (None, 50, 50, 64) 9472 conv1_pad[0][0]
_________________________________________________________________________________
conv1_bn (BatchNormalization) (None, 50, 50, 64) 256 conv1_conv[0][0]
_________________________________________________________________________________
conv1_relu (Activation) (None, 50, 50, 64) 0 conv1_bn[0][0]
_________________________________________________________________________________
pool1_pad (ZeroPadding2D) (None, 52, 52, 64) 0 conv1_relu[0][0]
_________________________________________________________________________________
pool1_pool (MaxPooling2D) (None, 25, 25, 64) 0 pool1_pad[0][0]
_________________________________________________________________________________
conv2_block1_1_conv (Conv2D) (None, 25, 25, 64) 4160 pool1_pool[0][0]
_________________________________________________________________________________
conv2_block1_1_bn (BatchNormali (None, 25, 25, 64) 256 conv2_block1_1_conv[0][0]
________________________________________________________________________________
conv2_block1_1_relu (Activation (None, 25, 25, 64) 0 conv2_block1_1_bn[0][0]
_________________________________________________________________________________
conv2_block1_2_conv (Conv2D) (None, 25, 25, 64) 36928 conv2_block1_1_relu[0][0]
_________________________________________________________________________________
conv2_block1_2_bn (BatchNormali (None, 25, 25, 64) 256 conv2_block1_2_conv[0][0]
_________________________________________________________________________________
conv2_block1_2_relu (Activation (None, 25, 25, 64) 0 conv2_block1_2_bn[0][0]
_________________________________________________________________________________
conv2_block1_0_conv (Conv2D) (None, 25, 25, 256) 16640 pool1_pool[0][0]
_________________________________________________________________________________
conv2_block1_3_conv (Conv2D) (None, 25, 25, 256) 16640 conv2_block1_2_relu[0][0]
=================================================================================
Total params: 84,608
Trainable params: 84,224
Non-trainable params: 384
input_anchor = tf.keras.layers.Input(shape = (100,100,3))
input_positive = tf.keras.layers.Input(shape = (100,100,3))
input_negative = tf.keras.layers.Input(shape = (100,100,3))
embedding_anchor = model2(input_anchor)
embedding_positive = model2(input_positive)
embedding_negative = model2(input_negative)
output = tf.keras.layers.concatenate([embedding_anchor[0],
embedding_positive[0],
embedding_negative[0]] , axis = -1)
siamese = tf.keras.models.Model([input_anchor, input_positive, input_negative], output)
siamese.summary()
_________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
=================================================================================
input_32 (InputLayer) [(None, 100, 100, 3) 0
_________________________________________________________________________________
input_33 (InputLayer) [(None, 100, 100, 3) 0
_________________________________________________________________________________
input_34 (InputLayer) [(None, 100, 100, 3) 0
_________________________________________________________________________________
model_5 (Functional) [(None, 100, 100, 3) 84608 input_32[0][0]
input_33[0][0]
input_34[0][0]
_________________________________________________________________________________
concatenate_7 (Concatenate) (None, 100, 100, 9) 0 model_5[15][0]
model_5[16][0]
model_5[17][0]
=================================================================================
Total params: 84,608
Trainable params: 84,224
Non-trainable params: 384
What I am confused about is why my output is (None,100,100,9) when I thought it would be (None,25,25,768). I have a feeling it has something to do with my model2, but I have no idea how to get the correct shape. Any help is much appreciated.
Here is some background on why you get this behavior, since you found it confusing. There is also a second issue in your code, covered further down.
Say we have a model M such as A -> B -> C -> D -> E -> F -> G, where each letter represents a layer, with A the input and G the output. Now say we have an input image X and want the output feature map of layer D only. In tf.keras we would simply do the following (assuming the layer is named "D"):
feat_model_a = tf.keras.Model(M.input, M.get_layer("D").output)
If, for some reason, we want the activations of all intermediate layers, we do the following instead:
all_feat = [layer.output for layer in M.layers]
feat_model_b = tf.keras.Model(inputs=M.input, outputs=all_feat)
Note that these two models, feat_model_a and feat_model_b, do not have the same number of outputs. feat_model_a produces a single output, the output of layer D, whereas feat_model_b produces len(all_feat) outputs, one per layer.
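As a toy illustration of the difference, here is a minimal sketch. The three-layer Dense model and the layer names "a", "b" and "c" are hypothetical stand-ins for M, chosen only to keep the example small; they are not part of the original code.

```python
import tensorflow as tf

# Hypothetical toy model standing in for M: input -> a -> b -> c.
inputs = tf.keras.Input(shape=(8,), name="inp")
x = tf.keras.layers.Dense(4, name="a")(inputs)
x = tf.keras.layers.Dense(3, name="b")(x)
outputs = tf.keras.layers.Dense(2, name="c")(x)
M = tf.keras.Model(inputs, outputs)

# One output: calling the model returns a single tensor.
feat_model_a = tf.keras.Model(M.input, M.get_layer("b").output)

# Many outputs: calling the model returns one tensor per requested layer.
# The InputLayer (index 0) is skipped here to keep the toy example simple.
all_feat = [layer.output for layer in M.layers[1:]]
feat_model_b = tf.keras.Model(M.input, all_feat)

batch = tf.ones((5, 8))
single = feat_model_a(batch)
many = feat_model_b(batch)

print(tuple(single.shape))              # shape of the output of layer "b"
print(len(many))                        # number of requested layer outputs
print([tuple(t.shape) for t in many])   # one shape per requested layer
```

With a multi-output model, indexing the result with [0] selects the first requested layer's output, not the first sample of a batch.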
Here is an example using your code. Consider the following two scenarios. First, we build a model that returns the output feature maps of all selected layers of the base model (the same thing you did):
# base model
model = tf.keras.applications.ResNet50(include_top=False,
weights=None,input_shape=(100,100,3),
pooling='max', classifier_activation='relu')
# taking first 15 layers output, total 15 output
layer_outputs = [layer.output for layer in model.layers[:15]]
model2 = tf.keras.models.Model(model.input, layer_outputs)
pred = model2(tf.ones((1, 100, 100, 3)))
print(len(pred))  # model2 returns 15 tensors, one per selected layer
print(pred[0].shape)  # output of layers[0], the InputLayer (the image itself)
print(pred[-1].shape)  # output of layers[14], conv2_block1_3_conv
15
(1, 100, 100, 3)
(1, 25, 25, 256)
As you can see, model2 returns 15 feature maps, exactly as it was defined. That explains what happens when you write:
model = tf.keras.applications.ResNet50(include_top=False,
weights=None,
input_shape=(100,100,3),
pooling='max', classifier_activation='relu')
layer_outputs = [layer.output for layer in model.layers[:15]]
model2 = tf.keras.models.Model(model.input, layer_outputs)
input_anchor = tf.keras.layers.Input(shape=(100,100,3))
...
...
embedding_anchor = model2(input_anchor)
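Here embedding_anchor[0] is the first of the 15 outputs, that is, the output of the input layer. Its shape is therefore (None, 100, 100, 3), not (None, 25, 25, 256), which is exactly what model2 was defined to return. Concatenating three such tensors along the last axis gives (None, 100, 100, 9). If we want only one feature map, for example the output of model.layers[15], we need to do the following: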
# base model
model = tf.keras.applications.ResNet50(include_top=False,
weights=None,input_shape=(100,100,3),
pooling='max', classifier_activation='relu')
model2 = tf.keras.models.Model(model.input, model.layers[15].output)
pred = model2(tf.ones((1, 100, 100, 3)))
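print(len(pred))  # pred is a single tensor, so len() is the batch size (1)
print(pred[0].shape)  # first (and only) sample in the batch
print(pred[-1].shape)  # the same sample, since the batch has one element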
1
(25, 25, 256)
(25, 25, 256)
I hope this clears up the confusion. The Functional API guide in the official documentation demonstrates this in the section "Extract and reuse nodes in the graph of layers", but without much detail. Whether you need every intermediate feature map or a single layer's output depends entirely on your use case.
There is another issue I think you should consider. Plotting a Keras model is not just a nicety; it is a convenient way to debug how features flow through the model. You took the first 15 layers of ResNet, but overlooked the skip connection inside that range. If you plot ResNet and look at the first layers, you will see that the graph branches after pool1_pool into two convolution paths, starting at conv2_block1_1_conv (index 7) and conv2_block1_0_conv (index 13), and that the two paths are merged again by conv2_block1_add at index 17. Cutting the model at index 15 therefore drops part of the block. That is why the cut should be at layer 17 at least.
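The effect of the cut position can be checked without plotting. The sketch below is plain Python, not a Keras API: it copies the layer connectivity from the ResNet50 summary shown in this thread (with the input layer simply called "input") and lists which earlier layers do not feed into the layer chosen as the output. The helper name dropped_layers is made up for illustration.

```python
# Layer connectivity copied from the ResNet50 summary in this thread.
# Position in the list corresponds to the index in model.layers.
LAYERS = [
    ("input", []),
    ("conv1_pad", ["input"]),
    ("conv1_conv", ["conv1_pad"]),
    ("conv1_bn", ["conv1_conv"]),
    ("conv1_relu", ["conv1_bn"]),
    ("pool1_pad", ["conv1_relu"]),
    ("pool1_pool", ["pool1_pad"]),
    ("conv2_block1_1_conv", ["pool1_pool"]),
    ("conv2_block1_1_bn", ["conv2_block1_1_conv"]),
    ("conv2_block1_1_relu", ["conv2_block1_1_bn"]),
    ("conv2_block1_2_conv", ["conv2_block1_1_relu"]),
    ("conv2_block1_2_bn", ["conv2_block1_2_conv"]),
    ("conv2_block1_2_relu", ["conv2_block1_2_bn"]),
    ("conv2_block1_0_conv", ["pool1_pool"]),
    ("conv2_block1_3_conv", ["conv2_block1_2_relu"]),
    ("conv2_block1_0_bn", ["conv2_block1_0_conv"]),
    ("conv2_block1_3_bn", ["conv2_block1_3_conv"]),
    ("conv2_block1_add", ["conv2_block1_0_bn", "conv2_block1_3_bn"]),
]


def dropped_layers(cut_index):
    """Return the layers up to cut_index that do NOT feed into LAYERS[cut_index]."""
    parents = dict(LAYERS)
    needed = set()
    stack = [LAYERS[cut_index][0]]
    # Walk backwards from the chosen output layer and collect its ancestors.
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(parents[name])
    return [name for name, _ in LAYERS[:cut_index + 1] if name not in needed]


print(dropped_layers(15))  # layers lost when the output is model.layers[15]
print(dropped_layers(17))  # layers lost when the output is model.layers[17]
```

Cutting at index 15 keeps only the shortcut path through conv2_block1_0_conv, while cutting at index 17 keeps both paths of the block.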
Here is the full working code:
model = tf.keras.applications.ResNet50(include_top=False,
weights=None,input_shape=(100,100,3),
pooling='max', classifier_activation='relu')
model2 = tf.keras.models.Model(model.input, model.layers[17].output)
input_anchor = tf.keras.layers.Input(shape=(100,100,3))
input_positive = tf.keras.layers.Input(shape=(100,100,3))
input_negative = tf.keras.layers.Input(shape=(100,100,3))
embedding_anchor = model2(input_anchor)
embedding_positive = model2(input_positive)
embedding_negative = model2(input_negative)
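# model2 has a single output, so concatenate the tensors directly;
# indexing with [0] would slice away the batch dimension
output = tf.keras.layers.concatenate(
    [embedding_anchor, embedding_positive, embedding_negative],
    axis=-1)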
siamese = tf.keras.models.Model([input_anchor, input_positive, input_negative], output)
tf.keras.utils.plot_model(siamese, show_shapes=True,
show_layer_names=True, expand_nested=True)
You defined model2 incorrectly. Here is the corrected code:
model = tf.keras.applications.ResNet50(include_top=False, weights=None,input_shape=(100,100,3),
pooling='max', classifier_activation='relu')
model2 = tf.keras.models.Model(model.input, model.layers[15].output)
model2.summary()
Summary of model2:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_5 (InputLayer) [(None, 100, 100, 3)] 0
_________________________________________________________________
conv1_pad (ZeroPadding2D) (None, 106, 106, 3) 0
_________________________________________________________________
conv1_conv (Conv2D) (None, 50, 50, 64) 9472
_________________________________________________________________
conv1_bn (BatchNormalization (None, 50, 50, 64) 256
_________________________________________________________________
conv1_relu (Activation) (None, 50, 50, 64) 0
_________________________________________________________________
pool1_pad (ZeroPadding2D) (None, 52, 52, 64) 0
_________________________________________________________________
pool1_pool (MaxPooling2D) (None, 25, 25, 64) 0
_________________________________________________________________
conv2_block1_0_conv (Conv2D) (None, 25, 25, 256) 16640
_________________________________________________________________
conv2_block1_0_bn (BatchNorm (None, 25, 25, 256) 1024
=================================================================
Total params: 27,392
Trainable params: 26,752
Non-trainable params: 640
_________________________________________________________________
Define the siamese model:
input_anchor = tf.keras.layers.Input(shape=(100,100,3))
input_positive = tf.keras.layers.Input(shape=(100,100,3))
input_negative = tf.keras.layers.Input(shape=(100,100,3))
embedding_anchor = model2(input_anchor)
embedding_positive = model2(input_positive)
embedding_negative = model2(input_negative)
output = tf.keras.layers.concatenate(
    [embedding_anchor[0], embedding_positive[0], embedding_negative[0]],
axis=-1)
siamese = tf.keras.models.Model([input_anchor, input_positive, input_negative], output)
siamese.summary()
Summary of the siamese model:
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_6 (InputLayer) [(None, 100, 100, 3) 0
__________________________________________________________________________________________________
input_7 (InputLayer) [(None, 100, 100, 3) 0
__________________________________________________________________________________________________
input_8 (InputLayer) [(None, 100, 100, 3) 0
__________________________________________________________________________________________________
functional_5 (Functional) (None, 25, 25, 256) 27392 input_6[0][0]
input_7[0][0]
input_8[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice (Tens [(25, 25, 256)] 0 functional_5[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_1 (Te [(25, 25, 256)] 0 functional_5[1][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_2 (Te [(25, 25, 256)] 0 functional_5[2][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (25, 25, 768) 0 tf_op_layer_strided_slice[0][0]
tf_op_layer_strided_slice_1[0][0]
tf_op_layer_strided_slice_2[0][0]
==================================================================================================
Total params: 27,392
Trainable params: 26,752
Non-trainable params: 640
__________________________________________________________________________________________________
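The concatenated output now has 768 channels (3 x 256). Note, however, that the summary reports (25, 25, 768) rather than (None, 25, 25, 768): model2 now returns a single tensor, so embedding_anchor[0] slices away the batch dimension (the tf_op_layer_strided_slice layers above). Concatenating embedding_anchor, embedding_positive and embedding_negative directly, without [0], gives (None, 25, 25, 768) as expected.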
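A minimal sketch of the siamese wiring for a single-output embedding model follows. To keep it quick to build, it uses a hypothetical one-layer stand-in encoder that maps (100, 100, 3) to (25, 25, 256) instead of the truncated ResNet50; the names enc_in, enc_out, encoder and sliced are assumptions made for this example only.

```python
import tensorflow as tf

# Hypothetical stand-in encoder mapping (100, 100, 3) to (25, 25, 256).
# It only mimics the output shape; substitute the truncated ResNet50 (model2).
enc_in = tf.keras.Input(shape=(100, 100, 3))
enc_out = tf.keras.layers.Conv2D(256, 3, strides=4, padding="same")(enc_in)
encoder = tf.keras.Model(enc_in, enc_out)

input_anchor = tf.keras.layers.Input(shape=(100, 100, 3))
input_positive = tf.keras.layers.Input(shape=(100, 100, 3))
input_negative = tf.keras.layers.Input(shape=(100, 100, 3))

embedding_anchor = encoder(input_anchor)
embedding_positive = encoder(input_positive)
embedding_negative = encoder(input_negative)

# The encoder has one output, so the embeddings are concatenated directly.
output = tf.keras.layers.concatenate(
    [embedding_anchor, embedding_positive, embedding_negative], axis=-1)
siamese = tf.keras.models.Model(
    [input_anchor, input_positive, input_negative], output)

# For comparison: [0] on a single tensor indexes the batch axis.
sliced = embedding_anchor[0]

print(tuple(output.shape))  # symbolic shape, batch dimension kept
print(tuple(sliced.shape))  # symbolic shape, batch dimension removed
```

The [0] index is only appropriate when the embedding model returns a list of outputs; with a single output it selects along the batch axis instead.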
By the same logic, I suggest truncating your ResNet at layers[17]. This is essential to include the residual connection of the first block. The code becomes:
model = tf.keras.applications.ResNet50(include_top=False, weights=None,input_shape=(100,100,3),
pooling='max', classifier_activation='relu')
model2 = tf.keras.models.Model(model.input, model.layers[17].output)
model2.summary()
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 100, 100, 3) 0
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D) (None, 106, 106, 3) 0 input_1[0][0]
__________________________________________________________________________________________________
conv1_conv (Conv2D) (None, 50, 50, 64) 9472 conv1_pad[0][0]
__________________________________________________________________________________________________
conv1_bn (BatchNormalization) (None, 50, 50, 64) 256 conv1_conv[0][0]
__________________________________________________________________________________________________
conv1_relu (Activation) (None, 50, 50, 64) 0 conv1_bn[0][0]
__________________________________________________________________________________________________
pool1_pad (ZeroPadding2D) (None, 52, 52, 64) 0 conv1_relu[0][0]
__________________________________________________________________________________________________
pool1_pool (MaxPooling2D) (None, 25, 25, 64) 0 pool1_pad[0][0]
__________________________________________________________________________________________________
conv2_block1_1_conv (Conv2D) (None, 25, 25, 64) 4160 pool1_pool[0][0]
__________________________________________________________________________________________________
conv2_block1_1_bn (BatchNormali (None, 25, 25, 64) 256 conv2_block1_1_conv[0][0]
__________________________________________________________________________________________________
conv2_block1_1_relu (Activation (None, 25, 25, 64) 0 conv2_block1_1_bn[0][0]
__________________________________________________________________________________________________
conv2_block1_2_conv (Conv2D) (None, 25, 25, 64) 36928 conv2_block1_1_relu[0][0]
__________________________________________________________________________________________________
conv2_block1_2_bn (BatchNormali (None, 25, 25, 64) 256 conv2_block1_2_conv[0][0]
__________________________________________________________________________________________________
conv2_block1_2_relu (Activation (None, 25, 25, 64) 0 conv2_block1_2_bn[0][0]
__________________________________________________________________________________________________
conv2_block1_0_conv (Conv2D) (None, 25, 25, 256) 16640 pool1_pool[0][0]
__________________________________________________________________________________________________
conv2_block1_3_conv (Conv2D) (None, 25, 25, 256) 16640 conv2_block1_2_relu[0][0]
__________________________________________________________________________________________________
conv2_block1_0_bn (BatchNormali (None, 25, 25, 256) 1024 conv2_block1_0_conv[0][0]
__________________________________________________________________________________________________
conv2_block1_3_bn (BatchNormali (None, 25, 25, 256) 1024 conv2_block1_3_conv[0][0]
__________________________________________________________________________________________________
conv2_block1_add (Add) (None, 25, 25, 256) 0 conv2_block1_0_bn[0][0]
conv2_block1_3_bn[0][0]
==================================================================================================
Total params: 86,656
Trainable params: 85,248
Non-trainable params: 1,408
__________________________________________________________________________________________________
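The concatenated siamese output still has 768 channels (3 x 256). The leading batch dimension None is kept only if the three embeddings are concatenated directly, without the [0] index, because model2 returns a single tensor.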