
Siamese neural network with two pre-trained ResNet 50 - strange behavior while testing model

I built a Siamese neural network using the Keras library. My model has two inputs of shape (64,64,3) and two pre-trained ResNet-50 branches. The loss function is binary cross-entropy.

The model is based on this paper (link).

During training I get very good train/val accuracy, about 0.99/0.98, and low loss, 0.01/0.05.

But when I test my saved model, I get bad results. The model can't even recognize two identical pictures.

I also noticed strange behavior: the greater the number of epochs, the worse the result. For example, when comparing two identical images, the model trained for 10 epochs gives the prediction "8.jpg": 0.5180479884147644, but the same model trained for 100 epochs gives "8.jpg": 5.579867080537926E-13. However, with 100 epochs I get better training results.

I've tried a different CNN backbone, ResNet18, and different input shapes, such as (224,224,3) or (128,128,3).

I've also tried using models that are not pre-trained, i.e. ResNet50/ResNet18 without pre-trained weights. But I get the same bad results when testing the real model.

My code is:

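# Imports assumed here (not shown in the original post): standalone Keras with the TensorFlow backend
import tensorflow as tf
from keras.applications.resnet50 import ResNet50
from keras.initializers import RandomNormal
from keras.layers import Dense, Dropout, Flatten, Input, Lambda
from keras.models import Model, load_model
from keras.optimizers import Adam


def create_base_model(image_shape, dropout_rate, suffix=''):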
    I1 = Input(shape=image_shape)
    model = ResNet50(include_top=False, weights='imagenet', input_tensor=I1, pooling=None)
    model.layers.pop()
    model.outputs = [model.layers[-1].output]
    model.layers[-1].outbound_nodes = []

    for layer in model.layers:
        layer.name = layer.name + str(suffix)
        layer.trainable = False

    flatten_name = 'flatten' + str(suffix)

    x = model.output
    x = Flatten(name=flatten_name)(x)
    x = Dense(1024, activation='relu')(x)
    x = Dropout(dropout_rate)(x)
    x = Dense(512, activation='relu')(x)
    x = Dropout(dropout_rate)(x)

    return x, model.input


def create_siamese_model(image_shape, dropout_rate):

    output_left, input_left = create_base_model(image_shape, dropout_rate)
    output_right, input_right = create_base_model(image_shape, dropout_rate, suffix="_2")

    L1_layer = Lambda(lambda tensors: tf.abs(tensors[0] - tensors[1]))
    L1_distance = L1_layer([output_left, output_right])
    L1_prediction = Dense(1, use_bias=True,
                          activation='sigmoid',
                          kernel_initializer=RandomNormal(mean=0.0, stddev=0.001),
                          name='weighted-average')(L1_distance)

    prediction = Dropout(0.2)(L1_prediction)

    siamese_model = Model(inputs=[input_left, input_right], outputs=prediction)

    return siamese_model

siamese_model = create_siamese_model(image_shape=(64, 64, 3),
                                     dropout_rate=0.2)

siamese_model.compile(loss='binary_crossentropy',
                      optimizer=Adam(lr=0.0001),
                      metrics=['binary_crossentropy', 'acc'])
siamese_model.fit_generator(train_gen,
                            steps_per_epoch=1000,
                            epochs=10,
                            verbose=1,
                            callbacks=[checkpoint, tensor_board_callback, lr_reducer, early_stopper, csv_logger],
                            validation_data=validation_data,
                            max_q_size=3)

siamese_model.save('siamese_model.h5')



# and my prediction code
siamese_net = load_model('siamese_model.h5', custom_objects={"tf": tf})

X_1 = [image, ] * len(markers)
batch = [markers, X_1]
result = siamese_net.predict_on_batch(batch)

# I've also tried checking identical images
markers = [image]
X_1 = [image, ] * len(markers)
batch = [markers, X_1]
result = siamese_net.predict_on_batch(batch)

I have some doubts about my prediction method. Could someone please help me find what is wrong with the predictions?

What you are getting is expected. I'm not sure what you mean by

Also I noticed strange behavior: the greater the number of epochs, the worse the result.

But the results you showed are valid and expected. Let's start with what the model is outputting. Your model's output is a (normalized) distance between the first and second inputs. If the inputs are similar, the distance should be close to zero. As the number of training steps increases, the model learns to distinguish the inputs: if they are similar, it learns to output values close to zero, and if they are different, it learns to output values close to one. So,

... trained model with 10 epochs gives prediction: "8.jpg": 0.5180479884147644 but the same model trained with 100 epochs gives "8.jpg": 5.579867080537926E-13. However for 100 epochs I have better train results.

confirms that the model has learned that the two inputs are similar: it outputs 5.579867080537926E-13 ≈ 0 (very close to 0).
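To make the point about the output concrete, here is a minimal pure-Python sketch of the head used in the question: a sigmoid over a weighted sum of element-wise absolute differences. The embeddings, weights, and bias below are hypothetical numbers chosen for illustration, not values from the trained model, and the sketch assumes (as this answer does) that similar pairs are labeled 0. For two identical inputs the L1 difference is exactly zero, so the score reduces to sigmoid(bias), which training can push arbitrarily close to 0.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def siamese_head(left, right, weights, bias):
    """Sigmoid of a weighted sum of element-wise absolute differences."""
    l1 = [abs(a - b) for a, b in zip(left, right)]
    z = sum(w * d for w, d in zip(weights, l1)) + bias
    return sigmoid(z)


# Hypothetical 4-dimensional embeddings and head parameters (illustrative only)
weights = [2.0, 1.5, 3.0, 0.5]
bias = -4.0
emb = [0.2, 0.7, 0.1, 0.9]
other = [1.9, 0.1, 2.4, 0.3]

same_score = siamese_head(emb, emb, weights, bias)    # identical inputs
diff_score = siamese_head(emb, other, weights, bias)  # clearly different inputs

print("identical pair:", same_score)
print("different pair:", diff_score)
```

With these made-up parameters the identical pair scores near 0 and the different pair scores near 1, which matches the reading of 5.579867080537926E-13 as "these two images are the same".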

Although the model is performing well, there is one issue I noticed in the model definition: the output layer is a Dropout layer. Dropout is not a valid output layer. With this setting, during training you are randomly setting the output of the model to zero with probability 0.2.

Let's assume the target variable is 1 (the two inputs are different), and the model has learned to identify the images correctly, outputting a value close to 1 before the dropout layer. Let's further assume that the dropout layer happens to set the output to zero, so the model output will be zero. Even though the layers before the dropout layer performed well, they will be penalized because of it. If this is not what you want, remove the last dropout layer.

L1_prediction = Dense(1, use_bias=True,
                    activation='sigmoid',
                    kernel_initializer=RandomNormal(mean=0.0, stddev=0.001),
                    name='weighted-average')(L1_distance)


siamese_model = Model(inputs=[input_left, input_right], outputs=L1_prediction)

However, this behavior is sometimes desirable if one wants to add noise to the model. It has the same effect as randomly altering the target variable when its value is 1.
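The penalty from a dropout layer on the output can be illustrated with a small pure-Python simulation. This is only a sketch under stated assumptions: `inverted_dropout` is a hypothetical helper that mimics training-time dropout on a single scalar (zero with probability `rate`, otherwise rescale by 1/(1 - rate)), and `EPS` is an assumed clipping constant similar to the default Keras backend epsilon. Note that Keras disables dropout at inference time, so this affects the training loss rather than the values returned by a plain predict call.

```python
import math
import random

EPS = 1e-7  # assumed clipping constant, similar to the default Keras backend epsilon


def binary_crossentropy(target, pred):
    """Binary cross-entropy for one sample, with the prediction clipped to (EPS, 1 - EPS)."""
    p = min(max(pred, EPS), 1.0 - EPS)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def inverted_dropout(value, rate, rng):
    """Hypothetical helper: training-time dropout applied to a single scalar."""
    if rng.random() < rate:
        return 0.0
    return value / (1.0 - rate)


rng = random.Random(0)
rate = 0.2
target = 1.0  # the pair is "different"
pred = 0.99   # the layers before dropout got it right

loss_without = binary_crossentropy(target, pred)
losses_with = [
    binary_crossentropy(target, inverted_dropout(pred, rate, rng))
    for _ in range(10000)
]
dropped_fraction = sum(1 for loss in losses_with if loss > 1.0) / len(losses_with)
mean_loss_with = sum(losses_with) / len(losses_with)

print("loss without output dropout:", loss_without)
print("fraction of steps with a zeroed output:", dropped_fraction)
print("mean loss with output dropout:", mean_loss_with)
```

In this simulation roughly one training step in five sees a correct prediction replaced by 0 and receives a very large loss, even though nothing upstream was wrong.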
