Keras functional API and multiple merge layers

I am trying to design a neural network using Keras. The model.summary() output is different from the layers I defined:

import numpy as np
np.random.seed(1337) 

from keras.models import Model
from keras.layers import Input, Convolution2D, MaxPooling2D, Activation, Flatten, merge

from keras import backend as K
K.set_image_dim_ordering('th')

input_shape = (3, 225, 225)
inp = Input(input_shape)

seq0 = MaxPooling2D(pool_size=(2, 2), strides=(2, 2), border_mode="same")(inp)
seq1 = Convolution2D(32, 1, 1, border_mode="same", activation="relu")(seq0)
seq2 = Convolution2D(32, 1, 1, border_mode="same", activation="relu")(seq1)
seq3 = merge([seq2, seq1], mode="concat", concat_axis=1)
seq4 = Convolution2D(32, 1, 1, border_mode="same", activation="relu")(seq3)
seq5 = merge([seq1, seq3], mode="concat", concat_axis=1)
seq6 = Convolution2D(128, 5, 5, border_mode="same", activation="relu")(seq5)
seq7 = merge([seq4, seq3], mode="concat", concat_axis=1)
seq8 = Convolution2D(512, 3, 3, border_mode="same", activation="relu")(seq7)
seq9 = merge([seq5, seq2], mode="concat", concat_axis=1)

seq = Flatten()(seq9)
out = Activation('softmax')(seq)


model = Model(input=inp, output=out)  
model.summary()

model.summary() output

Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_1 (InputLayer)             (None, 3, 225, 225)   0                                            
____________________________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D)    (None, 3, 113, 113)   0           input_1[0][0]                    
____________________________________________________________________________________________________
convolution2d_1 (Convolution2D)  (None, 32, 113, 113)  128         maxpooling2d_1[0][0]             
____________________________________________________________________________________________________
convolution2d_2 (Convolution2D)  (None, 32, 113, 113)  1056        convolution2d_1[0][0]            
____________________________________________________________________________________________________
merge_1 (Merge)                  (None, 64, 113, 113)  0           convolution2d_2[0][0]            
                                                                   convolution2d_1[0][0]            
____________________________________________________________________________________________________
merge_2 (Merge)                  (None, 96, 113, 113)  0           convolution2d_1[0][0]            
                                                                   merge_1[0][0]                    
____________________________________________________________________________________________________
merge_4 (Merge)                  (None, 128, 113, 113) 0           merge_2[0][0]                    
                                                                   convolution2d_2[0][0]            
____________________________________________________________________________________________________
flatten_1 (Flatten)              (None, 1634432)       0           merge_4[0][0]                    
____________________________________________________________________________________________________
activation_1 (Activation)        (None, 1634432)       0           flatten_1[0][0]                  
====================================================================================================

The seq4, seq6, and seq8 layers are missing from the model.summary() output. Am I doing something wrong?

You are not using them to compute the output.

Take seq4 for example: you feed it to seq7, which is fed to seq8, which doesn't go anywhere.

There is a problem in the structure of your model's graph.

In the summarised model, Keras keeps only the layers that lead from input=inp to output=out; the layers that aren't used on that path won't be part of your model's graph.

The flow going through seq4, seq6, seq7 and seq8 does not lead to the output of your model.

Does that help you?
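To make the pruning concrete, here is a minimal sketch (plain Python, not Keras internals) that mimics what Model(input=inp, output=out) does: it walks backwards from the output through each layer's inputs and keeps only what it can reach. The edges dict mirrors the connections in the question's code; the layer names are just labels for this illustration.

```python
# Each layer maps to the list of layers it consumes (mirrors the question's code).
edges = {
    "seq0": ["inp"],
    "seq1": ["seq0"],
    "seq2": ["seq1"],
    "seq3": ["seq2", "seq1"],
    "seq4": ["seq3"],
    "seq5": ["seq1", "seq3"],
    "seq6": ["seq5"],
    "seq7": ["seq4", "seq3"],
    "seq8": ["seq7"],
    "seq9": ["seq5", "seq2"],
    "flatten": ["seq9"],
    "out": ["flatten"],
}

def reachable_from_output(edges, output="out"):
    """Collect every layer reachable by walking inputs backwards from the output."""
    seen, stack = set(), [output]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(edges.get(node, []))
    return seen

kept = reachable_from_output(edges)
dropped = sorted(set(edges) - kept)
print(dropped)  # seq4, seq6, seq7 and seq8 never reach the output
```

This reproduces exactly what model.summary() shows: seq4, seq6, seq7 and seq8 are dropped, and everything else survives.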

EDIT :

The merge layer works like this example from your code:

seq3 = merge([seq2, seq1], mode="concat", concat_axis=1)

Here you take what comes out of the layers seq2 and seq1; each has shape (None, 32, 113, 113). Those two 4D tensors are the inputs of the merge layer. You specified that you want to concatenate them along axis 1, so the output of the merge layer has shape (None, 64, 113, 113): the two 32-channel dimensions are added together during concatenation. You can see exactly this at the merge_1 line of your model.summary().
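The same shape arithmetic can be checked with plain NumPy, since concat in Keras behaves like np.concatenate. The zero arrays below are stand-ins for the outputs of seq1 and seq2 (batch size 1 instead of None, "th" channel ordering).

```python
import numpy as np

# Two stand-in tensors for the outputs of seq1 and seq2:
# shape is (batch, channels, height, width) in "th" ordering.
a = np.zeros((1, 32, 113, 113))
b = np.zeros((1, 32, 113, 113))

# concat_axis=1 stacks along the channel axis: 32 + 32 = 64 channels.
merged = np.concatenate([a, b], axis=1)
print(merged.shape)  # (1, 64, 113, 113)
```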
