Feature learning with triplet loss after 1-2 epochs yields 100% val accuracy?

My NN has to learn image similarity with a custom triplet loss. The positive image is similar to the anchor, while the negative image is not.

My task is to predict which of the second and third images of an unseen triplet is more similar to the anchor.

The triplets are given for both the train and test sets in my task, so I did not have to mine them or generate them randomly: they are fixed.

---> Idea: to improve my model, I try feature learning with the Xception layers frozen and a Dense layer added on top.

Problem:

When training the model below with the Xception layers frozen, after 1-2 epochs it learns to simply map all positive images to a very low distance from the anchor and all negative images to a very high distance. Hence the 100% val accuracy.

I immediately thought of overfitting, but I only train a single fully connected layer. How can I combat this? Or is my triplet loss somehow defined incorrectly?

I don't use data augmentation, so could that potentially help?

Somehow this happens only when using a pretrained model. When I use a simple model, I get realistic accuracy...

What am I missing here?

My triplet loss:

def triplet_loss(y_true, y_pred, alpha = 0.4):
    """
    Implementation of the triplet loss function
    Arguments:
    y_true -- true labels, required when you define a loss in Keras, you don't need it in this function.
    y_pred -- python list containing three objects:
            anchor -- the encodings for the anchor data
            positive -- the encodings for the positive data (similar to anchor)
            negative -- the encodings for the negative data (different from anchor)
    Returns:
    loss -- real number, value of the loss
    """

    total_length = y_pred.shape.as_list()[-1]

    anchor = y_pred[:,0:int(total_length*1/3)]
    positive = y_pred[:,int(total_length*1/3):int(total_length*2/3)]
    negative = y_pred[:,int(total_length*2/3):int(total_length*3/3)]

    # distance between the anchor and the positive
    pos_dist = K.sum(K.square(anchor-positive),axis=1)

    # distance between the anchor and the negative
    neg_dist = K.sum(K.square(anchor-negative),axis=1)

    # compute loss
    basic_loss = pos_dist-neg_dist+alpha
    loss = K.maximum(basic_loss,0.0)

    return loss
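
A quick NumPy sanity check of the same formula (my own illustration, not part of the original post) shows the hinge going to zero as soon as the negative ends up at least alpha farther from the anchor than the positive:

import numpy as np

def triplet_loss_np(anchor, positive, negative, alpha=0.4):
    # same formula as the Keras loss above, written in NumPy for a quick check
    pos_dist = np.sum((anchor - positive) ** 2, axis=1)
    neg_dist = np.sum((anchor - negative) ** 2, axis=1)
    return np.maximum(pos_dist - neg_dist + alpha, 0.0)

a = np.array([[1.0, 0.0]])    # toy 2-D embedding for the anchor
p = np.array([[0.9, 0.1]])    # positive: close to the anchor
n = np.array([[-1.0, 0.0]])   # negative: far from the anchor
print(triplet_loss_np(a, p, n))   # [0.] -- hinge is inactive once neg_dist > pos_dist + alpha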

Then my model:

def baseline_model():
    input_1 = Input(shape=(256, 256, 3))
    input_2 = Input(shape=(256, 256, 3))
    input_3 = Input(shape=(256, 256, 3))

    pretrained_model = Xception(include_top=False, weights="imagenet")

    for layer in pretrained_model.layers:
        layer.trainable = False

    x1 = pretrained_model(input_1)
    x2 = pretrained_model(input_2)
    x3 = pretrained_model(input_3)

    x1 = Flatten(name='flatten1')(x1)
    x2 = Flatten(name='flatten2')(x2) 
    x3 = Flatten(name='flatten3')(x3)

    x1 = Dense(128, activation=None,kernel_regularizer=l2(0.01))(x1)
    x2 = Dense(128, activation=None,kernel_regularizer=l2(0.01))(x2)
    x3 = Dense(128, activation=None,kernel_regularizer=l2(0.01))(x3)

    x1 = Lambda(lambda x: K.l2_normalize(x,axis=-1))(x1)
    x2 = Lambda(lambda x: K.l2_normalize(x,axis=-1))(x2)
    x3 = Lambda(lambda x: K.l2_normalize(x,axis=-1))(x3)

    concat_vector = concatenate([x1, x2, x3], axis=-1, name='concat')

    model = Model([input_1, input_2, input_3], concat_vector)

    model.compile(loss=triplet_loss, optimizer=Adam(0.00001), metrics=[accuracy])

    model.summary()

    return model
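
The accuracy metric passed to compile is not defined in the post. A common choice for this kind of setup (an assumption on my part, written here as triplet_accuracy) is the fraction of triplets whose positive is strictly closer to the anchor than the negative, which is exactly the quantity that hits 100% on the validation set:

def triplet_accuracy(y_true, y_pred):
    # hypothetical metric, not shown in the original post: fraction of triplets
    # where the anchor-positive distance is smaller than the anchor-negative one
    total_length = y_pred.shape.as_list()[-1]
    anchor = y_pred[:, 0:int(total_length * 1 / 3)]
    positive = y_pred[:, int(total_length * 1 / 3):int(total_length * 2 / 3)]
    negative = y_pred[:, int(total_length * 2 / 3):]
    pos_dist = K.sum(K.square(anchor - positive), axis=1)
    neg_dist = K.sum(K.square(anchor - negative), axis=1)
    return K.mean(K.cast(pos_dist < neg_dist, 'float32'))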

Fitting my model:

model.fit(
        gen(X_train,batch_size=batch_size),
        steps_per_epoch=13281 // batch_size,
        epochs=10,
        validation_data=gen(X_val,batch_size=batch_size),
        validation_steps=1666 // batch_size,
        verbose=1,
        callbacks=callbacks_list
        )
model.save_weights('try_6.h5') 
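
The gen generator is not shown either; a minimal sketch of what it might look like for fixed triplets (hypothetical, assuming X_train is an array of shape (n, 3, 256, 256, 3) holding anchor/positive/negative images) is:

import numpy as np

def gen(triplets, batch_size=32):
    # hypothetical generator, not from the original post: yields batches of
    # [anchor, positive, negative] images plus a dummy target, since
    # triplet_loss ignores y_true
    n = len(triplets)
    while True:
        idx = np.random.randint(0, n, size=batch_size)
        batch = triplets[idx]
        yield [batch[:, 0], batch[:, 1], batch[:, 2]], np.zeros((batch_size, 1))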

Answer:

Please note that you use a different Dense layer for each input (you define 3 separate Dense layers; each time you create a new Dense object, it generates a new layer with new parameters, independent of the layers you created before). If the inputs are consistent, meaning input 1 is always the anchor, input 2 always the positive, and input 3 always the negative, it will be super easy for the model to overfit. What you should probably do is use only a single, shared Dense layer for all 3 inputs.

For example, based on your code, you can define the model like this:

pretrained_model = Xception(include_top=False, weights="imagenet")
for layer in pretrained_model.layers:
    layer.trainable = False

general_input = Input(shape=(256, 256, 3))
x = pretrained_model(general_input)
x = Flatten()(x)
x = Dense(128, activation=None,kernel_regularizer=l2(0.01))(x)

base_model = Model([general_input], [x])

input_1 = Input(shape=(256, 256, 3))
input_2 = Input(shape=(256, 256, 3))
input_3 = Input(shape=(256, 256, 3))

x1 = base_model(input_1)
x2 = base_model(input_2)
x3 = base_model(input_3)

# ... continue with your code - normalize, concat, etc.
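
One possible continuation of that sketch, simply reusing the normalization, concatenation and compile steps from the question:

x1 = Lambda(lambda x: K.l2_normalize(x, axis=-1))(x1)
x2 = Lambda(lambda x: K.l2_normalize(x, axis=-1))(x2)
x3 = Lambda(lambda x: K.l2_normalize(x, axis=-1))(x3)

concat_vector = concatenate([x1, x2, x3], axis=-1, name='concat')

model = Model([input_1, input_2, input_3], concat_vector)
# same compile call as in the question ('accuracy' here is whatever custom
# metric the original post used -- it is not shown there)
model.compile(loss=triplet_loss, optimizer=Adam(0.00001), metrics=[accuracy])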
