
Transfer Learning - Merging my top layers with pretrained model drops accuracy to 0%

My goal here is to attach my own top layers to a pre-trained model such as VGG19 and make predictions with the merged model. The merged model, however, predicts with 0% accuracy. I need a bit of help.

my own top layers

from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D
from keras.layers import Dropout, Flatten, Dense
from keras.models import Sequential

vgg19top_model = Sequential()
vgg19top_model.add(GlobalAveragePooling2D(input_shape=train_vgg19.shape[1:]))  # shape=(7, 7, 512)
vgg19top_model.add(Dense(255, activation='relu'))
vgg19top_model.add(Dropout(0.35))
vgg19top_model.add(Dense(133, activation='softmax'))
vgg19top_model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
global_average_pooling2d_1 ( (None, 512)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 255)               130815    
_________________________________________________________________
dropout_1 (Dropout)          (None, 255)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 133)               34048     
=================================================================
Total params: 164,863
Trainable params: 164,863
Non-trainable params: 0

I trained my top model on the bottleneck features and got 72% accuracy.
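For reference, a minimal sketch of that training step, assuming the standard bottleneck-feature workflow (the random data and hyperparameters here are placeholders, not the original code):

```python
import numpy as np
import keras
from keras.layers import GlobalAveragePooling2D, Dense, Dropout
from keras.models import Sequential

# Bottleneck features: VGG19 (include_top=False) outputs of shape (N, 7, 7, 512).
# Random arrays stand in for the real precomputed features and labels here.
train_vgg19 = np.random.rand(20, 7, 7, 512).astype("float32")
train_labels = keras.utils.to_categorical(np.random.randint(0, 133, 20), 133)

top_model = Sequential([
    keras.Input(shape=(7, 7, 512)),
    GlobalAveragePooling2D(),
    Dense(255, activation="relu"),
    Dropout(0.35),
    Dense(133, activation="softmax"),
])
top_model.compile(optimizer="rmsprop", loss="categorical_crossentropy",
                  metrics=["accuracy"])
top_model.fit(train_vgg19, train_labels, epochs=1, batch_size=8, verbose=0)
```

The parameter count of this sketch matches the summary above (164,863), since the architecture is the same.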

Reloading those trained weights here (code not shown).
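The weight-loading code isn't shown in the post; a typical save/reload round trip looks like the following sketch (the tiny model and file name are illustrative stand-ins for `vgg19top_model` and its checkpoint):

```python
import keras
from keras.layers import Dense
from keras.models import Sequential

# Stand-in for the trained top model.
model = Sequential([keras.Input(shape=(4,)), Dense(3, activation="softmax")])
model.save_weights("top.weights.h5")  # Keras 3 expects the .weights.h5 suffix

# Later: rebuild the exact same architecture and reload the saved weights.
clone = Sequential([keras.Input(shape=(4,)), Dense(3, activation="softmax")])
clone.load_weights("top.weights.h5")
```

The reload only works if the rebuilt architecture matches the one the weights were saved from layer for layer.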

load VGG19 bottom layers to merge with my top layers

from keras import applications
vgg19 = applications.vgg19.VGG19(include_top=False, weights='imagenet', input_shape=(224, 224, 3))
vgg19.summary()

Layer (type)                 Output Shape              Param #   
=================================================================
input_3 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
...
...
_________________________________________________________________
block5_conv4 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
=================================================================
Total params: 20,024,384
Trainable params: 20,024,384
Non-trainable params: 0

merge the 2 models

from keras.models import Model

# Take VGG19 up to its final pooling layer to use as the base
block5_pool = vgg19.get_layer('block5_pool')  # output shape: (None, 7, 7, 512)
vgg19_base = Model(inputs=vgg19.input, outputs=block5_pool.output)

new_model = Sequential()
new_model.add(vgg19_base)
new_model.add(vgg19top_model)
new_model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
model_12 (Model) <-VGG19     (None, 7, 7, 512)         20024384  
_________________________________________________________________
sequential_6 (Sequential)    (None, 133)               164863    
=================================================================
Total params: 20,189,247
Trainable params: 164,863
Non-trainable params: 20,024,384

Now let's test the merged model end to end on some predictions.

It completely fails, with 0% accuracy.

How can I test this new model end to end? Or rather, why are its predictions so bad?
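One detail that is easy to miss when testing end to end: the VGG19 bottleneck features are normally computed from images passed through `keras.applications.vgg19.preprocess_input` (RGB→BGR channel flip plus ImageNet mean subtraction). If the merged model is fed raw pixel values instead, its predictions will be effectively random. A sketch of a correctly preprocessed end-to-end prediction, with a random array standing in for a real loaded image (this preprocessing mismatch is a common cause of the symptom, though the post doesn't confirm it):

```python
import numpy as np
from keras.applications.vgg19 import preprocess_input

# Stand-in for a loaded 224x224 RGB image batch
# (in practice load a real test image, e.g. via keras.utils.load_img).
img = np.random.randint(0, 256, size=(1, 224, 224, 3)).astype("float32")

# Apply the same preprocessing used when the bottleneck features were computed:
# channels flipped to BGR, then per-channel ImageNet means subtracted.
x = preprocess_input(img.copy())  # copy: preprocess_input can modify in place

# preds = new_model.predict(x)   # shape (1, 133)
# print(np.argmax(preds[0]))     # predicted class index
```

If the 72%-accurate top model suddenly scores 0% after merging, comparing its output on precomputed bottleneck features against the merged model's output on the same preprocessed image is a quick way to localize the mismatch.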

From the way you have done it, I think you are stacking two models sequentially, where the first is a truncated VGG19 (only some of its layers).

That is not the best way to improve your accuracy. First, it only increases the number of parameters in the network: because the models are combined sequentially, the computation becomes very heavy. Second, it will not improve accuracy, because the first model already extracts both low-level and higher-level features, and the CNN layers of the second model then extract unnecessarily complex features on top of them.

You might want to try another approach, such as a Siamese CNN or a bilinear CNN (B-CNN). The idea is to feed the training set into two CNNs and then merge the outputs of the two branches. This method has been shown to extract a richer variety of features from the input images.

You can visit this website for more information about B-CNNs: http://vis-www.cs.umass.edu/bcnn/
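As a sketch of that two-branch idea (not the full bilinear pooling from the B-CNN site above — just two CNN branches over the same input with their outputs concatenated; all layer sizes are illustrative):

```python
import keras
from keras.layers import Conv2D, GlobalAveragePooling2D, Dense, Concatenate
from keras.models import Model

inputs = keras.Input(shape=(224, 224, 3))

# Branch A and branch B: two independent feature extractors over the same input.
a = Conv2D(32, 3, activation="relu")(inputs)
a = GlobalAveragePooling2D()(a)

b = Conv2D(32, 5, activation="relu")(inputs)
b = GlobalAveragePooling2D()(b)

# Merge the two branches' features, then classify into the 133 classes.
merged = Concatenate()([a, b])
outputs = Dense(133, activation="softmax")(merged)

two_branch = Model(inputs, outputs)
```

The functional API is needed here rather than `Sequential`, since the graph branches and re-merges.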
