
Confused about Keras output

I am using the Keras functional API and I am confused about the output.

model = tf.keras.applications.ResNet50(include_top = False,
                                       weights = None,
                                       input_shape = (100,100,3),
                                       pooling = 'max',
                                       classifier_activation = 'relu')
layer_outputs = [layer.output for layer in model.layers[:15]]
model2 = tf.keras.models.Model(model.input, layer_outputs)
model2.summary()

_________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
=================================================================================
input_13 (InputLayer)           [(None, 100, 100, 3) 0                                            
_________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 106, 106, 3)  0           input_13[0][0]                   
_________________________________________________________________________________
conv1_conv (Conv2D)             (None, 50, 50, 64)   9472        conv1_pad[0][0]                  
_________________________________________________________________________________
conv1_bn (BatchNormalization)   (None, 50, 50, 64)   256         conv1_conv[0][0]                 
_________________________________________________________________________________
conv1_relu (Activation)         (None, 50, 50, 64)   0           conv1_bn[0][0]                   
_________________________________________________________________________________
pool1_pad (ZeroPadding2D)       (None, 52, 52, 64)   0           conv1_relu[0][0]                 
_________________________________________________________________________________
pool1_pool (MaxPooling2D)       (None, 25, 25, 64)   0           pool1_pad[0][0]                  
_________________________________________________________________________________
conv2_block1_1_conv (Conv2D)    (None, 25, 25, 64)   4160        pool1_pool[0][0]                 
_________________________________________________________________________________
conv2_block1_1_bn (BatchNormali (None, 25, 25, 64)   256         conv2_block1_1_conv[0][0]        
________________________________________________________________________________
conv2_block1_1_relu (Activation (None, 25, 25, 64)   0           conv2_block1_1_bn[0][0]          
_________________________________________________________________________________
conv2_block1_2_conv (Conv2D)    (None, 25, 25, 64)   36928       conv2_block1_1_relu[0][0]        
_________________________________________________________________________________
conv2_block1_2_bn (BatchNormali (None, 25, 25, 64)   256         conv2_block1_2_conv[0][0]        
_________________________________________________________________________________
conv2_block1_2_relu (Activation (None, 25, 25, 64)   0           conv2_block1_2_bn[0][0]          
_________________________________________________________________________________
conv2_block1_0_conv (Conv2D)    (None, 25, 25, 256)  16640       pool1_pool[0][0]                 
_________________________________________________________________________________
conv2_block1_3_conv (Conv2D)    (None, 25, 25, 256)  16640       conv2_block1_2_relu[0][0]        
=================================================================================
Total params: 84,608
Trainable params: 84,224
Non-trainable params: 384
Then I build a triplet (siamese) model on top of model2:

input_anchor = tf.keras.layers.Input(shape = (100,100,3))
input_positive = tf.keras.layers.Input(shape = (100,100,3)) 
input_negative = tf.keras.layers.Input(shape = (100,100,3)) 

embedding_anchor = model2(input_anchor)
embedding_positive = model2(input_positive)
embedding_negative = model2(input_negative)

output = tf.keras.layers.concatenate([embedding_anchor[0],
                                      embedding_positive[0],
                                      embedding_negative[0]] , axis = -1)
siamese = tf.keras.models.Model([input_anchor, input_positive, input_negative], output)
siamese.summary()

_________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
=================================================================================
input_32 (InputLayer)           [(None, 100, 100, 3) 0                                            
_________________________________________________________________________________
input_33 (InputLayer)           [(None, 100, 100, 3) 0                                            
_________________________________________________________________________________
input_34 (InputLayer)           [(None, 100, 100, 3) 0                                            
_________________________________________________________________________________
model_5 (Functional)            [(None, 100, 100, 3) 84608       input_32[0][0]                   
                                                                 input_33[0][0]                   
                                                                 input_34[0][0]                   
_________________________________________________________________________________
concatenate_7 (Concatenate)     (None, 100, 100, 9)  0           model_5[15][0]                   
                                                                 model_5[16][0]                   
                                                                 model_5[17][0]                   
=================================================================================
Total params: 84,608
Trainable params: 84,224
Non-trainable params: 384

What I am confused about is why my output is (None, 100, 100, 9) when I thought it would be (None, 25, 25, 768). I have a feeling it is something to do with my model2, but I have no idea how to get the correct shape. Any help is much appreciated.

There is also another issue in your code that you might want to consider. But first, since you found this behavior confusing, here is some background on why you get it.


What is confusing you

Let's say we have a model M such as A -> B -> C -> D -> E -> F -> G, where each letter represents a layer, A being the input and G the output. Now, say we have some input image X and want the output feature maps only from layer D of model M. For that, we would simply do the following in tf.keras (fetching layer D by name with get_layer):

feat_model_a = tf.keras.Model(M.input, M.get_layer('D').output)

Now, if for some reason we want the activations of all intermediate layers, only then do we do the following:

all_feat = [layer.output for layer in M.layers]
feat_model_b = tf.keras.Model(inputs=M.input, outputs=all_feat)

Now you can see that these two models, feat_model_a and feat_model_b, do not have the same number of outputs. feat_model_a produces a single output, the output of layer D, whereas feat_model_b produces len(all_feat) outputs, one per layer.
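To make this concrete, here is a minimal runnable sketch with a made-up toy model (the layer names B and C are hypothetical, chosen to mirror the letters above):

import tensorflow as tf

inp = tf.keras.Input(shape=(8,))
x = tf.keras.layers.Dense(4, name='B')(inp)
out = tf.keras.layers.Dense(2, name='C')(x)
M = tf.keras.Model(inp, out)

# Single output: the model returns one tensor.
feat_model_a = tf.keras.Model(M.input, M.get_layer('B').output)
print(feat_model_a(tf.ones((1, 8))).shape)   # (1, 4)

# Multiple outputs: the model returns one tensor per layer.
all_feat = [layer.output for layer in M.layers]
feat_model_b = tf.keras.Model(M.input, all_feat)
preds = feat_model_b(tf.ones((1, 8)))
print(len(preds))                            # 3: input, B, C
print([tuple(p.shape) for p in preds])       # [(1, 8), (1, 4), (1, 2)]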

Here is an example using your code. Observe the following two scenarios. First, we build a model which gives all the output feature maps of the first 15 layers of the base model (the same thing that you did):

# base model 
model = tf.keras.applications.ResNet50(include_top=False,
                                       weights=None, input_shape=(100,100,3),
                                       pooling='max', classifier_activation='relu')

# take the first 15 layers' outputs, 15 outputs in total
layer_outputs = [layer.output for layer in model.layers[:15]]

model2 = tf.keras.models.Model(model.input, layer_outputs)
pred = model2(tf.ones((1, 100, 100, 3)))

print(len(pred))      # model2 produces 15 outputs, one per layer
print(pred[0].shape)  # output of the first layer (the input itself)
print(pred[-1].shape) # output of the 15th layer

15
(1, 100, 100, 3) 
(1, 25, 25, 256)

Here you can see that model2 gives 15 output feature maps, as designed. Hopefully you can now relate: when you pass

model = tf.keras.applications.ResNet50(include_top=False, 
                                       weights=None,
                                       input_shape=(100,100,3), 
                                       pooling='max', classifier_activation='relu')
layer_outputs = [layer.output for layer in model.layers[:15]]
model2 = tf.keras.models.Model(model.input, layer_outputs)

input_anchor = tf.keras.layers.Input(shape=(100,100,3)) 
...
...
embedding_anchor = model2(input_anchor)

the first element, embedding_anchor[0], does not have shape (None, 25, 25, 256) but (None, 100, 100, 3), which is correct, because you designed model2 that way: its first output is the input layer's output. But if we want only the feature maps at position 15, we need to do the following:

# base model 
model = tf.keras.applications.ResNet50(include_top=False,
                                       weights=None, input_shape=(100,100,3),
                                       pooling='max', classifier_activation='relu')

model2 = tf.keras.models.Model(model.input, model.layers[15].output)
pred = model2(tf.ones((1, 100, 100, 3)))

print(len(pred))      # pred is now a single tensor; len gives the batch size, 1
print(pred[0].shape)  # feature map of the first (only) sample in the batch
print(pred[-1].shape) # same: last (only) sample's feature map

1
(25, 25, 256)
(25, 25, 256)

I hope your confusion is now resolved. The official Functional API guide demonstrates this in the section Extract and reuse nodes in the graph of layers, but without much detail. Whether we need all the feature map outputs or a single layer's output depends entirely on our needs.


Pick an appropriate layer for feature extraction

Here is another issue I think you should consider. Plotting the Keras model is not just a fancy tool; it is a convenient way to debug how features flow through the model. You chose the first 15 layers of ResNet but overlooked that there is a skip connection among them. If you plot ResNet50 and look at the graph of its first layers, you will find that two conv branches split off after pool1_pool, at around position 7 or 8 (layers conv2_block1_1_conv and conv2_block1_0_conv), and rejoin at position 17 (conv2_block1_add). So when you cut at position 15, you also drop some layer operations. That is why the cut should be at layer 17 at the earliest. You can verify this by listing the layers, as in the sketch below.
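A quick way to see the split and the merge without plotting is to print each layer's index, name, and output shape; a minimal sketch using the model defined above:

# The branches split after pool1_pool and rejoin at conv2_block1_add.
for i, layer in enumerate(model.layers[:18]):
    print(i, layer.name, layer.output.shape)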

[plot of the first layers of ResNet50, showing two branches that split after pool1_pool and rejoin at conv2_block1_add]


Here is the full working code

model = tf.keras.applications.ResNet50(include_top=False,
                                       weights=None, input_shape=(100,100,3),
                                       pooling='max', classifier_activation='relu')
model2 = tf.keras.models.Model(model.input, model.layers[17].output)

input_anchor   = tf.keras.layers.Input(shape=(100,100,3))
input_positive = tf.keras.layers.Input(shape=(100,100,3))
input_negative = tf.keras.layers.Input(shape=(100,100,3))

embedding_anchor   = model2(input_anchor)
embedding_positive = model2(input_positive)
embedding_negative = model2(input_negative)

# model2 now returns a single tensor per input, so concatenate the tensors
# directly; indexing with [0] would slice off the batch dimension.
output = tf.keras.layers.concatenate(
    [embedding_anchor, embedding_positive, embedding_negative],
    axis=-1)

siamese = tf.keras.models.Model([input_anchor, input_positive, input_negative], output)
tf.keras.utils.plot_model(siamese, show_shapes=True,
                          show_layer_names=True, expand_nested=True)
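
A quick shape check on dummy data (a hypothetical batch of 2) confirms the result:

dummy = tf.ones((2, 100, 100, 3))
print(siamese([dummy, dummy, dummy]).shape)   # (2, 25, 25, 768)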

[plot of the siamese model: three inputs feeding the shared model2, followed by a Concatenate layer]

You defined model2 incorrectly. Here is the corrected code:

model = tf.keras.applications.ResNet50(include_top=False, weights=None,
                                       input_shape=(100,100,3),
                                       pooling='max', classifier_activation='relu')
model2 = tf.keras.models.Model(model.input, model.layers[15].output)
model2.summary()

Summary of model2:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_5 (InputLayer)         [(None, 100, 100, 3)]     0         
_________________________________________________________________
conv1_pad (ZeroPadding2D)    (None, 106, 106, 3)       0         
_________________________________________________________________
conv1_conv (Conv2D)          (None, 50, 50, 64)        9472      
_________________________________________________________________
conv1_bn (BatchNormalization (None, 50, 50, 64)        256       
_________________________________________________________________
conv1_relu (Activation)      (None, 50, 50, 64)        0         
_________________________________________________________________
pool1_pad (ZeroPadding2D)    (None, 52, 52, 64)        0         
_________________________________________________________________
pool1_pool (MaxPooling2D)    (None, 25, 25, 64)        0         
_________________________________________________________________
conv2_block1_0_conv (Conv2D) (None, 25, 25, 256)       16640     
_________________________________________________________________
conv2_block1_0_bn (BatchNorm (None, 25, 25, 256)       1024      
=================================================================
Total params: 27,392
Trainable params: 26,752
Non-trainable params: 640
_________________________________________________________________

Define the siamese model:

input_anchor = tf.keras.layers.Input(shape=(100,100,3)) 
input_positive = tf.keras.layers.Input(shape=(100,100,3)) 
input_negative = tf.keras.layers.Input(shape=(100,100,3)) 

embedding_anchor = model2(input_anchor)
embedding_positive = model2(input_positive)
embedding_negative = model2(input_negative)

output = tf.keras.layers.concatenate(
    [embedding_anchor[0], embedding_positive[0], embedding_negative[0]],
    axis=-1)

siamese = tf.keras.models.Model([input_anchor, input_positive, input_negative], output)
siamese.summary()

Summary of the siamese model:

Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_6 (InputLayer)            [(None, 100, 100, 3) 0                                            
__________________________________________________________________________________________________
input_7 (InputLayer)            [(None, 100, 100, 3) 0                                            
__________________________________________________________________________________________________
input_8 (InputLayer)            [(None, 100, 100, 3) 0                                            
__________________________________________________________________________________________________
functional_5 (Functional)       (None, 25, 25, 256)  27392       input_6[0][0]                    
                                                                 input_7[0][0]                    
                                                                 input_8[0][0]                    
__________________________________________________________________________________________________
tf_op_layer_strided_slice (Tens [(25, 25, 256)]      0           functional_5[0][0]               
__________________________________________________________________________________________________
tf_op_layer_strided_slice_1 (Te [(25, 25, 256)]      0           functional_5[1][0]               
__________________________________________________________________________________________________
tf_op_layer_strided_slice_2 (Te [(25, 25, 256)]      0           functional_5[2][0]               
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (25, 25, 768)        0           tf_op_layer_strided_slice[0][0]  
                                                                 tf_op_layer_strided_slice_1[0][0]
                                                                 tf_op_layer_strided_slice_2[0][0]
==================================================================================================
Total params: 27,392
Trainable params: 26,752
Non-trainable params: 640
__________________________________________________________________________________________________

Now the channel dimension is 768, as desired. One caveat: since model2 has a single output tensor, embedding_anchor[0] slices off the batch dimension (hence the tf_op_layer_strided_slice layers and the (25, 25, 768) shape in the summary above). To keep the batch dimension and get (None, 25, 25, 768), concatenate embedding_anchor, embedding_positive and embedding_negative directly, without the [0] indexing.
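A quick check on dummy data (a hypothetical batch of 2), reusing the tensors defined above, illustrates the difference:

dummy = tf.ones((2, 100, 100, 3))
print(siamese([dummy, dummy, dummy]).shape)   # (25, 25, 768): batch dim sliced away

# Concatenating the full tensors keeps the batch dimension:
full = tf.keras.layers.concatenate(
    [embedding_anchor, embedding_positive, embedding_negative], axis=-1)
siamese_full = tf.keras.models.Model(
    [input_anchor, input_positive, input_negative], full)
print(siamese_full([dummy, dummy, dummy]).shape)   # (2, 25, 25, 768)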

With the same logic, I suggest you consider truncating your ResNet at model.layers[17]. This is essential in order to include the residual connection at the top of your truncated model. So it becomes:

model = tf.keras.applications.ResNet50(include_top=False, weights=None,
                                       input_shape=(100,100,3),
                                       pooling='max', classifier_activation='relu')
model2 = tf.keras.models.Model(model.input, model.layers[17].output)
model2.summary()

Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 100, 100, 3) 0                                            
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 106, 106, 3)  0           input_1[0][0]                    
__________________________________________________________________________________________________
conv1_conv (Conv2D)             (None, 50, 50, 64)   9472        conv1_pad[0][0]                  
__________________________________________________________________________________________________
conv1_bn (BatchNormalization)   (None, 50, 50, 64)   256         conv1_conv[0][0]                 
__________________________________________________________________________________________________
conv1_relu (Activation)         (None, 50, 50, 64)   0           conv1_bn[0][0]                   
__________________________________________________________________________________________________
pool1_pad (ZeroPadding2D)       (None, 52, 52, 64)   0           conv1_relu[0][0]                 
__________________________________________________________________________________________________
pool1_pool (MaxPooling2D)       (None, 25, 25, 64)   0           pool1_pad[0][0]                  
__________________________________________________________________________________________________
conv2_block1_1_conv (Conv2D)    (None, 25, 25, 64)   4160        pool1_pool[0][0]                 
__________________________________________________________________________________________________
conv2_block1_1_bn (BatchNormali (None, 25, 25, 64)   256         conv2_block1_1_conv[0][0]        
__________________________________________________________________________________________________
conv2_block1_1_relu (Activation (None, 25, 25, 64)   0           conv2_block1_1_bn[0][0]          
__________________________________________________________________________________________________
conv2_block1_2_conv (Conv2D)    (None, 25, 25, 64)   36928       conv2_block1_1_relu[0][0]        
__________________________________________________________________________________________________
conv2_block1_2_bn (BatchNormali (None, 25, 25, 64)   256         conv2_block1_2_conv[0][0]        
__________________________________________________________________________________________________
conv2_block1_2_relu (Activation (None, 25, 25, 64)   0           conv2_block1_2_bn[0][0]          
__________________________________________________________________________________________________
conv2_block1_0_conv (Conv2D)    (None, 25, 25, 256)  16640       pool1_pool[0][0]                 
__________________________________________________________________________________________________
conv2_block1_3_conv (Conv2D)    (None, 25, 25, 256)  16640       conv2_block1_2_relu[0][0]        
__________________________________________________________________________________________________
conv2_block1_0_bn (BatchNormali (None, 25, 25, 256)  1024        conv2_block1_0_conv[0][0]        
__________________________________________________________________________________________________
conv2_block1_3_bn (BatchNormali (None, 25, 25, 256)  1024        conv2_block1_3_conv[0][0]        
__________________________________________________________________________________________________
conv2_block1_add (Add)          (None, 25, 25, 256)  0           conv2_block1_0_bn[0][0]          
                                                                 conv2_block1_3_bn[0][0]          
==================================================================================================
Total params: 86,656
Trainable params: 85,248
Non-trainable params: 1,408
__________________________________________________________________________________________________

This maintains the desired 768-channel output; as noted above, drop the [0] indexing in the concatenation to keep the batch dimension and get (None, 25, 25, 768).
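As a side note, instead of relying on the magic index 17, you can locate the merge layer by name with get_layer; a small sketch, assuming the layer names shown in the summary above:

# Cut at the residual-merge layer by name rather than by index.
model2 = tf.keras.models.Model(model.input,
                               model.get_layer('conv2_block1_add').output)
print(model2.output_shape)   # (None, 25, 25, 256)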
