How to generate predictions on a testing triplet dataset after training a Siamese network

Question

I have a dataset of images and two txt files in which each line contains the ids of three pictures. The first file is for training: each line tells me that the first picture is more similar to the second one than to the third one. The second file is for testing: for each line I have to predict whether the first image is more similar to the second or to the third one. To do this I have trained a Siamese network with triplet loss, using this article as a guideline: https://keras.io/examples/vision/siamese_network/

After training the network I do not know how to proceed to evaluate my testing dataset. To prepare the data I have done:

with open('test_triplets.txt') as f:
    lines2 = f.readlines()
lines2 = [line.split('\n', 1)[0] for line in lines2]
anchor2 = [line.split()[0] for line in lines2]
pic1 = [line.split()[1] for line in lines2]
pic2 = [line.split()[2] for line in lines2]

anchor2 = ['food/' + item + '.jpg' for item in anchor2]
pic1 = ['food/' + item + '.jpg' for item in pic1]
pic2 = ['food/' + item + '.jpg' for item in pic2]

anchor2_dataset = tf.data.Dataset.from_tensor_slices(anchor2)
pic1_dataset = tf.data.Dataset.from_tensor_slices(pic1)
pic2_dataset = tf.data.Dataset.from_tensor_slices(pic2)

test_dataset = tf.data.Dataset.zip((anchor2_dataset, pic1_dataset, pic2_dataset))
test_dataset = test_dataset.map(preprocess_triplets)
test_dataset = test_dataset.batch(32, drop_remainder=False)
test_dataset = test_dataset.prefetch(8)

I then tried to use a for loop as follows, but the running time is too high, since the txt file has around 50000 lines.

n_images = len(anchor2)
results  = np.zeros((n_images,2))
for i in range(n_images):
    sample = next(iter(test_dataset))
    anchor, positive, negative = sample
    anchor_embedding, positive_embedding, negative_embedding = (
        embedding(resnet.preprocess_input(anchor)),
        embedding(resnet.preprocess_input(positive)),
        embedding(resnet.preprocess_input(negative)),
    )
    cosine_similarity = metrics.CosineSimilarity()

    positive_similarity = cosine_similarity(anchor_embedding, positive_embedding)
    results[i,0] = positive_similarity.numpy()

    negative_similarity = cosine_similarity(anchor_embedding, negative_embedding)
    results[i,1] = negative_similarity.numpy()

How can I generate predictions on my testing triplets? My objective is a vector of shape [n_testing_triplets x 1] in which each row is 1 if the first picture is more similar to the anchor than the second one, and 0 otherwise.

Answer 1

You can stack your images first, then calculate all embeddings in parallel like this:

import numpy as np
stack = np.stack([anchor0, positive0, negative0, ..., anchor999, positive999, negative999])

# then calculate all embeddings at the same time like this
embeddings = list(embedding(resnet.preprocess_input(stack)).numpy())
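With around 50000 triplets, stacking every image into one array may not fit in memory, so the same idea can be applied one batch at a time. Note also that the loop in the question calls next(iter(test_dataset)) on every pass, which creates a fresh iterator each time and so keeps returning the first batch. The following is a minimal sketch, not a definitive implementation. The helper name embed_triplets and its embed_fn argument are hypothetical. The sketch assumes that batches yields (anchor, first, second) image arrays of equal length and that embed_fn maps a batch of images to one embedding per image.

```python
import numpy as np


def embed_triplets(batches, embed_fn):
    """Embed (anchor, first, second) image batches with one model call per batch.

    `batches` is any iterable of three equally sized image arrays.
    `embed_fn` is a hypothetical callable returning one embedding per image.
    """
    anchor_parts, first_parts, second_parts = [], [], []
    for anchor, first, second in batches:
        n = len(anchor)
        # Stack the three image batches so the model runs once on 3 * n images.
        stacked = np.concatenate([anchor, first, second], axis=0)
        emb = np.asarray(embed_fn(stacked))
        emb = emb.reshape(len(emb), -1)  # flatten to shape (3 * n, d)
        # Split the result back into the three roles.
        anchor_parts.append(emb[:n])
        first_parts.append(emb[n:2 * n])
        second_parts.append(emb[2 * n:])
    return (
        np.concatenate(anchor_parts),
        np.concatenate(first_parts),
        np.concatenate(second_parts),
    )


# Hypothetical usage with the objects from the question (not run here):
# embed_fn = lambda x: embedding(resnet.preprocess_input(x)).numpy()
# anchor_emb, first_emb, second_emb = embed_triplets(
#     test_dataset.as_numpy_iterator(), embed_fn
# )
```

Each pass handles a whole batch of 32 triplets, so the 50000 lines need roughly 1563 model calls instead of one call per image.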

Then you compare the embeddings as you want, in a loop:

cosine_similarity = metrics.CosineSimilarity()

positive_similarity = cosine_similarity(embeddings[0], embeddings[1])
whatever_storage = positive_similarity.numpy()

negative_similarity = cosine_similarity(embeddings[0], embeddings[2])
whatever_storage = negative_similarity.numpy()
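Once the embeddings are available as arrays, the comparison can also be done without a Python loop. The sketch below is one possible approach under stated assumptions. It computes the cosine similarity directly with NumPy rather than through metrics.CosineSimilarity, partly because Keras metric objects, as far as I understand, keep a running average across calls, so reusing one object for both comparisons may blend the two values. The function names are hypothetical, and each input is assumed to have shape (n_triplets, d).

```python
import numpy as np


def rowwise_cosine_similarity(a, b, eps=1e-12):
    """Cosine similarity between matching rows of two (n, d) arrays."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    dot = np.sum(a * b, axis=1)
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    # eps guards against division by zero for all-zero embeddings.
    return dot / np.maximum(norms, eps)


def predict_from_embeddings(anchor_emb, first_emb, second_emb):
    """Return an (n, 1) array: 1 if the first picture is closer to the anchor, else 0."""
    first_sim = rowwise_cosine_similarity(anchor_emb, first_emb)
    second_sim = rowwise_cosine_similarity(anchor_emb, second_emb)
    return (first_sim > second_sim).astype(np.int64).reshape(-1, 1)
```

The result has the [n_testing_triplets x 1] shape asked for in the question. Ties are mapped to 0 here, which is an arbitrary choice.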

Answer 1
0 2022-07-07 10:41:26