Using Keras APIs, how can I import images in batches with exactly K instances of each ID in a given batch?
I'm trying to implement batch hard triplet loss, as seen in Section 3.2 of https://arxiv.org/pdf/2004.06271.pdf .
I need to import my images so that each batch contains exactly K instances of each ID that appears in that batch. Therefore, each batch size must be a multiple of K.
I have a directory of images too large to fit into memory, so I am using ImageDataGenerator.flow_from_directory() to import the images, but I can't see any parameters for this function that provide the behaviour I need.
How can I achieve this batch behaviour using Keras?
As of TensorFlow 2.4, I don't see a standard way of doing that with an ImageDataGenerator.
So I think you need to write your own generator based on the tensorflow.keras.utils.Sequence class, so that you are free to define the batch contents yourself.
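For illustration, here is a minimal sketch of such a Sequence. It assumes you have already built a dict mapping each ID to its list of image paths, plus a function that loads one image as an array; the names BalancedIDSequence, paths_by_id, and load_image_fn are hypothetical, not Keras APIs:

    import numpy as np
    import tensorflow as tf

    class BalancedIDSequence(tf.keras.utils.Sequence):
        """Yields batches with exactly K instances of each of P sampled IDs."""

        def __init__(self, paths_by_id, load_image_fn, P=8, K=4):
            self.paths_by_id = paths_by_id        # {id: [image paths]} (assumed)
            self.ids = list(paths_by_id.keys())
            self.load_image_fn = load_image_fn    # path -> HxWxC array (assumed)
            self.P = P                            # identities per batch
            self.K = K                            # instances per identity
            self.rng = np.random.default_rng()

        def __len__(self):
            # Enough batches per epoch to visit each ID roughly once.
            return max(1, len(self.ids) // self.P)

        def __getitem__(self, index):
            # Sample P distinct IDs, then K images from each (with replacement
            # if an ID has fewer than K images), giving a batch of size P * K.
            batch_ids = self.rng.choice(self.ids, size=self.P, replace=False)
            images, labels = [], []
            for identity in batch_ids:
                paths = self.paths_by_id[identity]
                picks = self.rng.choice(paths, size=self.K,
                                        replace=len(paths) < self.K)
                for path in picks:
                    images.append(self.load_image_fn(path))
                    labels.append(identity)
            return np.stack(images), np.array(labels)

Each __getitem__ call returns one batch of P * K images, which matches the "P identities × K instances" sampling scheme that batch hard triplet loss expects.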
References:
https://www.tensorflow.org/api_docs/python/tf/keras/utils/Sequence
https://stanford.edu/~shervine/blog/keras-how-to-generate-data-on-the-fly
You can try merging several data streams together in a controlled manner. Given you have K instances of tf.data.Dataset (it does not matter how you instantiate them), each responsible for supplying the training instances of one particular ID, you can concatenate them to get an even distribution inside a mini-batch:
ds1 = ... # Training instances with ID == 1
ds2 = ... # Training instances with ID == 2
...
dsK = ... # Training instances with ID == K
train_dataset = tf.data.Dataset.zip((ds1, ds2, ..., dsK)).flat_map(concat_datasets).batch(batch_size=N * K)
where concat_datasets is the merge function:
def concat_datasets(*datasets):
    # flat_map unpacks each zipped element into one argument per source
    # dataset; wrap each element back into a one-element dataset and chain
    # them, so the IDs are emitted in turn.
    ds = tf.data.Dataset.from_tensors(datasets[0])
    for i in range(1, len(datasets)):
        ds = ds.concatenate(tf.data.Dataset.from_tensors(datasets[i]))
    return ds
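For completeness, here is one possible way to instantiate the per-ID datasets, reusing concat_datasets from above. This is a sketch under assumptions: the images/<id>/ directory layout and the make_id_dataset helper are made up for illustration, not part of the answer or the TensorFlow API:

    import tensorflow as tf

    K = 4  # number of IDs being merged, as above
    N = 8  # instances of each ID per batch

    def make_id_dataset(id_dir, label):
        # An endlessly repeating stream of (image, label) pairs for one ID;
        # .repeat() keeps zip() from stopping at the smallest ID's image count.
        def load(path):
            image = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
            return tf.image.resize(image, (224, 224)), label
        return (tf.data.Dataset.list_files(id_dir + "/*.jpg", shuffle=True)
                .map(load, num_parallel_calls=tf.data.AUTOTUNE)
                .repeat())

    datasets = [make_id_dataset(f"images/{i}", i) for i in range(1, K + 1)]
    train_dataset = (tf.data.Dataset.zip(tuple(datasets))
                     .flat_map(concat_datasets)
                     .batch(N * K))

Note that flat_map emits one instance of each ID in turn, so each batch of N * K images contains exactly N instances of each of the K IDs; in this answer's notation K counts the IDs, while the question's K (instances per ID) corresponds to N here.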