Siamese neural network with two pre-trained ResNet-50 models: strange behavior when testing the model

Question

I built a Siamese neural network using the Keras library. My model has two inputs of shape (64,64,3), each fed to a pre-trained ResNet-50. The loss function is binary cross-entropy.

The model is based on this paper (a link).

During training I get very good train/val accuracy, about 0.99/0.98, and low loss, about 0.01/0.05.

But when I test my saved model, I get bad results. The model can't even recognize two identical pictures.

I also noticed strange behavior: the more epochs I train for, the worse the result. For example, when comparing two identical images, the model trained for 10 epochs gives the prediction "8.jpg": 0.5180479884147644, but the same model trained for 100 epochs gives "8.jpg": 5.579867080537926E-13. However, with 100 epochs I get better training results.

I've tried a different CNN, ResNet18, and different input shapes, such as (224,224,3) or (128,128,3).

I've also tried not using a pre-trained model, only ResNet50/ResNet18 without pre-trained weights. But I get the same bad results when testing the real model.

My code is:

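# Imports assumed for standalone Keras 2.x with the TensorFlow backend
import tensorflow as tf
from keras.applications import ResNet50
from keras.initializers import RandomNormal
from keras.layers import Dense, Dropout, Flatten, Input, Lambda
from keras.models import Model, load_model
from keras.optimizers import Adam


def create_base_model(image_shape, dropout_rate, suffix=''):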
    I1 = Input(shape=image_shape)
    model = ResNet50(include_top=False, weights='imagenet', input_tensor=I1, pooling=None)
    model.layers.pop()
    model.outputs = [model.layers[-1].output]
    model.layers[-1].outbound_nodes = []

    for layer in model.layers:
        layer.name = layer.name + str(suffix)
        layer.trainable = False

    flatten_name = 'flatten' + str(suffix)

    x = model.output
    x = Flatten(name=flatten_name)(x)
    x = Dense(1024, activation='relu')(x)
    x = Dropout(dropout_rate)(x)
    x = Dense(512, activation='relu')(x)
    x = Dropout(dropout_rate)(x)

    return x, model.input


def create_siamese_model(image_shape, dropout_rate):

    output_left, input_left = create_base_model(image_shape, dropout_rate)
    output_right, input_right = create_base_model(image_shape, dropout_rate, suffix="_2")

    L1_layer = Lambda(lambda tensors: tf.abs(tensors[0] - tensors[1]))
    L1_distance = L1_layer([output_left, output_right])
    L1_prediction = Dense(1, use_bias=True,
                          activation='sigmoid',
                          kernel_initializer=RandomNormal(mean=0.0, stddev=0.001),
                          name='weighted-average')(L1_distance)

    prediction = Dropout(0.2)(L1_prediction)

    siamese_model = Model(inputs=[input_left, input_right], outputs=prediction)

    return siamese_model

siamese_model = create_siamese_model(image_shape=(64, 64, 3),
                                     dropout_rate=0.2)

siamese_model.compile(loss='binary_crossentropy',
                      optimizer=Adam(lr=0.0001),
                      metrics=['binary_crossentropy', 'acc'])
siamese_model.fit_generator(train_gen,
                            steps_per_epoch=1000,
                            epochs=10,
                            verbose=1,
                            callbacks=[checkpoint, tensor_board_callback, lr_reducer, early_stopper, csv_logger],
                            validation_data=validation_data,
                            max_q_size=3)

siamese_model.save('siamese_model.h5')



# and my prediction code
siamese_net = load_model('siamese_model.h5', custom_objects={"tf": tf})

X_1 = [image, ] * len(markers)
batch = [markers, X_1]
result = siamese_net.predict_on_batch(batch)

# I've also tried comparing identical images
markers = [image]
X_1 = [image, ] * len(markers)
batch = [markers, X_1]
result = siamese_net.predict_on_batch(batch)

I have some doubts about my prediction method. Could someone please help me find what is wrong with the predictions?

Answer 1

What you are getting is expected. I'm not sure what you mean by:

Also I noticed strange behavior: the greater the number of epochs the result is worse.

But the results you have shown are valid and expected. Let's start with what the model outputs. Your model's output is a (normalized) distance between the first and second inputs. If the inputs are similar, the distance should be close to zero. As the number of training steps increases, the model learns to tell the inputs apart: if the inputs are similar it learns to output values close to zero, and if they are different it learns to output values close to one. So this:

... trained model with 10 epoch gives prediction: "8.jpg": 0.5180479884147644 but the same model trained with 100 epoch gives "8.jpg": 5.579867080537926E-13 However for 100 epoch I have better train results.

confirms that the model has learned that the two inputs are similar, and it outputs 5.579867080537926E-13 ≈ 0.
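To make the interpretation above concrete, here is a minimal pure-Python sketch of what the final layer in the question computes: a sigmoid over a weighted sum of the element-wise absolute difference of the two branch embeddings. The function name `siamese_score` and all weights, biases, and embeddings are made up for illustration; they are not taken from the trained model.

```python
import math


def siamese_score(emb_left, emb_right, weights, bias):
    """Sigmoid of the weighted L1 distance between two embeddings.

    Mirrors the Lambda(|a - b|) + Dense(1, activation='sigmoid') head
    from the question; all numeric values used below are hypothetical.
    """
    l1 = [abs(a - b) for a, b in zip(emb_left, emb_right)]
    logit = sum(w * d for w, d in zip(weights, l1)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical learned parameters: positive weights, negative bias.
weights = [2.0, 2.0, 2.0]
bias = -3.0

same = siamese_score([0.5, 1.0, 0.2], [0.5, 1.0, 0.2], weights, bias)
different = siamese_score([0.5, 1.0, 0.2], [2.5, 0.0, 1.7], weights, bias)

print(f"identical embeddings: {same:.4f}")
print(f"different embeddings: {different:.4f}")
```

With these illustrative parameters, identical embeddings score below 0.5 and clearly different embeddings score above 0.5, which matches the convention used in this answer that values near 0 mean "similar". Whether 0 or 1 means "similar" in a real model depends on how the training labels were assigned.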

Although the model is performing well, there is one issue I've noticed in the model definition: the output layer is a dropout layer. Dropout is not a valid output layer. With this setting, during training you are randomly setting the model's output to zero with probability 0.2.

Let's assume the target variable is 1 (the two inputs are different), and the model has learned to identify the images correctly and outputs a value close to 1 before the dropout layer. Let's further assume that the dropout layer has decided to set the output to zero. So the model output will be zero. Even though the layers before the dropout layer have performed well, they will be penalized because of the dropout layer. If this is not what you want, remove the last dropout layer:

L1_prediction = Dense(1, use_bias=True,
                      activation='sigmoid',
                      kernel_initializer=RandomNormal(mean=0.0, stddev=0.001),
                      name='weighted-average')(L1_distance)


siamese_model = Model(inputs=[input_left, input_right], outputs=L1_prediction)

However, this behavior is sometimes wanted if one wants to add noise to the model. It has the same effect as randomly altering the target variable when its value is 1.
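The effect of a final dropout layer can be sketched in plain Python. This is a simplified model of inverted dropout on a single output value, not Keras code: during training the value is zeroed with probability `rate` and otherwise scaled by 1 / (1 - rate), while at inference it passes through unchanged. The helper name `dropout_scalar` and the sample prediction 0.98 are assumptions for illustration.

```python
import random


def dropout_scalar(value, rate, training, rng):
    """Simplified inverted dropout applied to one scalar output."""
    if not training:
        return value  # dropout is inactive at inference time
    if rng.random() < rate:
        return 0.0  # the unit is dropped
    return value / (1.0 - rate)  # kept units are scaled up


rng = random.Random(0)
prediction = 0.98  # hypothetical sigmoid output for a "different" pair
rate = 0.2

train_outputs = [dropout_scalar(prediction, rate, True, rng) for _ in range(10000)]
dropped_fraction = sum(1 for out in train_outputs if out == 0.0) / len(train_outputs)

print(f"fraction of training outputs forced to zero: {dropped_fraction:.3f}")
print(f"inference output: {dropout_scalar(prediction, rate, False, rng)}")
```

In this sketch roughly one training output in five is forced to zero regardless of how good the prediction was, and the surviving outputs are scaled to 0.98 / 0.8, which is above 1 and so no longer a valid probability. Both effects disappear once the dropout layer is no longer the last layer.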

Solution 1
1 Accepted 2019-05-27 10:26:45
