
How to generate predictions on testing triplets dataset after training Siamese network

I have a dataset of images and two txt files in which each line contains the ids of three pictures. The first file is for training: each line tells me that the first picture is more similar to the second one than to the third. The second file is for testing: for each line I have to predict whether the first image is more similar to the second or to the third picture. To do this I have trained a Siamese network with triplet loss, using this article as a guideline: https://keras.io/examples/vision/siamese_network/

After training the network I do not know how to proceed to evaluate my testing dataset. To prepare the data I have done:

import tensorflow as tf

# Each line of the file holds three image ids: anchor, first candidate, second candidate.
with open('test_triplets.txt') as f:
    lines2 = [line.split() for line in f if line.strip()]
anchor2 = [ids[0] for ids in lines2]
pic1 = [ids[1] for ids in lines2]
pic2 = [ids[2] for ids in lines2]

anchor2 = ['food/' + item + '.jpg' for item in anchor2]
pic1 = ['food/' + item + '.jpg' for item in pic1]
pic2 = ['food/' + item + '.jpg' for item in pic2]

anchor2_dataset = tf.data.Dataset.from_tensor_slices(anchor2)
pic1_dataset = tf.data.Dataset.from_tensor_slices(pic1)
pic2_dataset = tf.data.Dataset.from_tensor_slices(pic2)

test_dataset = tf.data.Dataset.zip((anchor2_dataset, pic1_dataset, pic2_dataset))
test_dataset = test_dataset.map(preprocess_triplets)
test_dataset = test_dataset.batch(32, drop_remainder=False)
test_dataset = test_dataset.prefetch(8)

I then tried to use a for loop as follows, but the running time is too high since the txt file has around 50000 lines.

n_images = len(anchor2)
results  = np.zeros((n_images,2))
for i in range(n_images):
    sample = next(iter(test_dataset))
    anchor, positive, negative = sample
    anchor_embedding, positive_embedding, negative_embedding = (
        embedding(resnet.preprocess_input(anchor)),
        embedding(resnet.preprocess_input(positive)),
        embedding(resnet.preprocess_input(negative)),
    )
    cosine_similarity = metrics.CosineSimilarity()

    positive_similarity = cosine_similarity(anchor_embedding, positive_embedding)
    results[i,0] = positive_similarity.numpy()

    negative_similarity = cosine_similarity(anchor_embedding, negative_embedding)
    results[i,1] = negative_similarity.numpy()

How can I generate predictions on my testing triplets? My objective is a vector [n_testing_triplets x 1] where each entry is 1 if the first candidate picture (pic1) is more similar to the anchor than the second one (pic2), and 0 otherwise.

You can stack your images first, then compute all the embeddings in a single batched call, like this:

import numpy as np
stack = np.stack([anchor0, positive0, negative0, ..., anchor999, positive999, negative999])

# then compute all the embeddings in one call, like this
embeddings = list(embedding(resnet.preprocess_input(stack)).numpy())
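
With the interleaved order used above, row 3*i of the result is the anchor of triplet i, row 3*i + 1 its first candidate and row 3*i + 2 its second candidate. A minimal NumPy sketch of regrouping such an array follows; the helper name split_interleaved is hypothetical, the embeddings are assumed to be a 2-D (rows, features) array, and a small made-up array stands in for real model output:

```python
import numpy as np

def split_interleaved(embeddings):
    """Regroup rows ordered [a0, p0, n0, a1, p1, n1, ...] into three (n, d) arrays."""
    embeddings = np.asarray(embeddings)
    if embeddings.shape[0] % 3 != 0:
        raise ValueError("expected a multiple of 3 rows")
    # Shape (n, 3, d): axis 1 indexes anchor / first candidate / second candidate.
    grouped = embeddings.reshape(-1, 3, embeddings.shape[-1])
    return grouped[:, 0], grouped[:, 1], grouped[:, 2]

# Made-up stand-in for the embedding model's output: 2 triplets, 2 features each.
fake = np.arange(12, dtype=np.float32).reshape(6, 2)
anchors, firsts, seconds = split_interleaved(fake)
```

Keeping the three groups as separate arrays makes it possible to compare every triplet at once instead of indexing rows one by one.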

Then you compare the embeddings however you want, in a loop:

# A Keras metric object keeps a running mean across calls, so use a fresh
# instance for each comparison to get a per-pair value.
positive_similarity = metrics.CosineSimilarity()(embeddings[0], embeddings[1])
positive_storage = positive_similarity.numpy()

negative_similarity = metrics.CosineSimilarity()(embeddings[0], embeddings[2])
negative_storage = negative_similarity.numpy()
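
For roughly 50000 triplets, one possible way to get the requested 0/1 vector is to walk through the batched dataset exactly once and compare whole batches at a time. The sketch below is only an illustration under assumptions: predict_triplets and cosine_rows are hypothetical helper names, embed stands for any callable that maps a batch of images to a 2-D array of embeddings, and ties are counted as 1.

```python
import numpy as np

def cosine_rows(a, b, eps=1e-12):
    """Row-wise cosine similarity between two (n, d) arrays."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return num / np.maximum(den, eps)

def predict_triplets(batches, embed):
    """Return a 1-D 0/1 array: 1 where the first candidate is at least as
    similar to the anchor as the second candidate.

    batches: iterable of (anchor, first, second) image batches
    embed:   callable mapping a batch of images to an (n, d) array
    """
    preds = []
    for anchor, first, second in batches:
        n = len(anchor)
        # One forward pass per batch instead of three.
        stacked = np.concatenate([anchor, first, second], axis=0)
        emb = np.asarray(embed(stacked))
        a, p, q = emb[:n], emb[n:2 * n], emb[2 * n:]
        preds.append((cosine_rows(a, p) >= cosine_rows(a, q)).astype(np.int64))
    return np.concatenate(preds)

# Toy check with 2-D "images" and an identity embedding (made-up data).
toy_batches = [(
    np.array([[1.0, 0.0], [0.0, 1.0]]),   # anchors
    np.array([[1.0, 0.1], [1.0, 0.0]]),   # first candidates
    np.array([[0.0, 1.0], [0.0, 2.0]]),   # second candidates
)]
toy_preds = predict_triplets(toy_batches, lambda x: x)
```

With the objects from the question, this could presumably be called as predict_triplets(test_dataset.as_numpy_iterator(), lambda x: embedding(resnet.preprocess_input(x)).numpy()), assuming eager execution; the result can be reshaped with .reshape(-1, 1) to get the [n_testing_triplets x 1] layout.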
