Transfer Learning - Merging my top layers with pretrained model drops accuracy to 0%
My goal here is to attach my own top layers to a pre-trained model such as VGG19 and make predictions with the merged model. The merged model has 0% accuracy, and I need a bit of help working out why.
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D
from keras.layers import Dropout, Flatten, Dense
from keras.models import Sequential
vgg19top_model = Sequential()
vgg19top_model.add(GlobalAveragePooling2D(input_shape=train_vgg19.shape[1:])) # shape=(7, 7, 512)
vgg19top_model.add(Dense(255, activation='relu'))
vgg19top_model.add(Dropout(0.35))
vgg19top_model.add(Dense(133, activation='softmax'))
vgg19top_model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
global_average_pooling2d_1 ( (None, 512) 0
_________________________________________________________________
dense_1 (Dense) (None, 255) 130815
_________________________________________________________________
dropout_1 (Dropout) (None, 255) 0
_________________________________________________________________
dense_2 (Dense) (None, 133) 34048
=================================================================
Total params: 164,863
Trainable params: 164,863
Non-trainable params: 0
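As a quick sanity check on the summary above, the parameter counts can be reproduced by hand. This is only a sketch: `dense_params` is a hypothetical helper, and the layer sizes are taken from the code and summary in the question.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units


# GlobalAveragePooling2D reduces (7, 7, 512) to 512 values and has no parameters.
pooled_features = 512

dense_1 = dense_params(pooled_features, 255)  # Dense(255, activation='relu')
dense_2 = dense_params(255, 133)              # Dense(133, activation='softmax')
total = dense_1 + dense_2

print(dense_1, dense_2, total)  # 130815 34048 164863
```

The numbers match the summary, so the top model itself has the shape it is expected to have.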
Reloading those weights here (code not shown).
from keras import applications
vgg19=applications.vgg19.VGG19(include_top=False, weights='imagenet',input_shape=(224, 224, 3))
vgg19.summary()
Layer (type) Output Shape Param #
=================================================================
input_3 (InputLayer) (None, 224, 224, 3) 0
_________________________________________________________________
...
...
_________________________________________________________________
block5_conv4 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, 7, 7, 512) 0
=================================================================
Total params: 20,024,384
Trainable params: 20,024,384
Non-trainable params: 0
from keras.layers import Input, Dense
from keras.models import Model
block5_pool = vgg19.get_layer('block5_pool') # last VGG19 layer, output shape=(?, 7, 7, 512)
bn_conv1_model = Model(inputs=vgg19.input, outputs=block5_pool.output)
new_model = Sequential()
new_model.add(bn_conv1_model)
new_model.add(vgg19top_model)
new_model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
model_12 (Model) <-VGG19 (None, 7, 7, 512) 20024384
_________________________________________________________________
sequential_6 (Sequential) (None, 133) 164863
=================================================================
Total params: 20,189,247
Trainable params: 164,863
Non-trainable params: 20,024,384
and it completely fails with 0% accuracy.
How can I test this new model end to end? Or rather, why are its predictions so bad?
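One possible way to test the merged model end to end is to check that it produces the same output as the two-step pipeline (base model, then top model) on the same batch. The sketch below is an assumption-laden illustration, not the asker's code: `outputs_match` is a hypothetical helper, and the toy functions stand in for the real models so that it runs without Keras. With Keras, the three callables could be `bn_conv1_model.predict`, `vgg19top_model.predict` and `new_model.predict`.

```python
import numpy as np


def outputs_match(base_predict, top_predict, merged_predict, batch, atol=1e-5):
    """Return True if merged(batch) equals top(base(batch)) within a tolerance."""
    two_step = top_predict(base_predict(batch))
    one_step = merged_predict(batch)
    return bool(np.allclose(two_step, one_step, atol=atol))


# Toy stand-ins (hypothetical) so the sketch runs without downloaded weights.
rng = np.random.default_rng(0)
weights = rng.normal(size=(512, 133))


def toy_base(x):
    # Pretend feature extractor: passes (N, 7, 7, 512) feature maps through.
    return x


def toy_top(features):
    # Global average pooling over the 7x7 grid, then a linear layer and softmax.
    pooled = features.mean(axis=(1, 2))
    logits = pooled @ weights
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


def toy_merged(x):
    # Correctly merged pipeline.
    return toy_top(toy_base(x))


def toy_broken(x):
    # Same pipeline fed differently scaled input (a preprocessing mismatch).
    return toy_top(toy_base(x / 255.0))


batch = rng.normal(size=(4, 7, 7, 512))
ok = outputs_match(toy_base, toy_top, toy_merged, batch)
broken = outputs_match(toy_base, toy_top, toy_broken, batch)
print(ok, broken)
```

If the two paths disagree, the merge or the weight reloading is the likely culprit. If they agree and accuracy is still 0%, it may be worth checking (this is a guess, since the input pipeline is not shown) that the images get the same preprocessing that was used when the `train_vgg19` features were computed, and that the labels are in the same order.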
I think, from the way you built it, you are stacking two VGG19 models, with the first one containing only some of the VGG19 layers.
That is not the best way to improve your accuracy. First, combining the models sequentially only increases the number of network parameters, so the computation becomes very heavy. Second, it will not improve accuracy: the first model already extracts both low-level and higher-level features, and the CNN layers of the second model then extract more unnecessarily complex features on top of them.
You might want to try another method such as a Siamese CNN or a bilinear CNN (BCNN). The idea is to feed the training set to two CNN models and then merge the outputs of these two CNNs. This method has been shown to extract a more varied set of features from the input images.
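To make the "merge the outputs of two CNNs" step concrete, here is a minimal NumPy sketch of bilinear pooling as it is usually described for BCNN: outer products of the two feature maps summed over all spatial locations, followed by a signed square root and L2 normalisation. The function name `bilinear_pool` and the random arrays standing in for CNN outputs are assumptions for illustration only.

```python
import numpy as np


def bilinear_pool(features_a, features_b, eps=1e-12):
    """Merge two feature maps of shape (H, W, Ca) and (H, W, Cb).

    Returns a 1-D descriptor of length Ca * Cb.
    """
    h, w, ca = features_a.shape
    cb = features_b.shape[-1]
    a = features_a.reshape(h * w, ca)
    b = features_b.reshape(h * w, cb)
    # a.T @ b is the sum of per-location outer products (sum pooling).
    pooled = (a.T @ b).reshape(-1)
    # Signed square root, then L2 normalisation.
    pooled = np.sign(pooled) * np.sqrt(np.abs(pooled))
    return pooled / (np.linalg.norm(pooled) + eps)


# Random stand-ins for the outputs of two CNN streams on one image.
rng = np.random.default_rng(0)
stream_a = rng.normal(size=(7, 7, 8))
stream_b = rng.normal(size=(7, 7, 16))

descriptor = bilinear_pool(stream_a, stream_b)
print(descriptor.shape)  # (128,)
```

The resulting descriptor would then be fed to a classifier layer. Small channel counts are used here only to keep the example light; with two 512-channel streams the descriptor has 512 * 512 entries.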
You can visit this website for more information about BCNN: http://vis-www.cs.umass.edu/bcnn/