TensorFlow: concatenating tf.data.Dataset batches

Question

When using tf.data.Dataset, is it possible to concatenate batches of two datasets so that the second dataset is not appended to the end of the first, but instead the first batch of the second dataset is concatenated after the first batch of the first dataset, and so on?

I tried the following, but it gave me a dataset with length 40, whereas I would expect length 80 here.

train_data = train_data.batch(40).concatenate(augmentation_data.batch(40))

Answer 1

I'm not exactly sure what your use case is, but you might want to concatenate the feature tensors and the label tensors of the two batches separately, like this:

import tensorflow as tf

def concat_batches(x, y):
    # x and y are (features_dict, labels) batches with the same feature keys.
    features1, labels1 = x
    features2, labels2 = y
    features = {name: tf.concat([features1[name], features2[name]], axis=0) for name in features1}
    return features, tf.concat([labels1, labels2], axis=0)

Here is an example:

dataset = tf.data.Dataset.from_tensor_slices(({"test": [[1], [1], [1], [1]]}, [1, 1, 1, 1]))
b1 = dataset.repeat().batch(3).make_one_shot_iterator().get_next()
dataset2 = tf.data.Dataset.from_tensor_slices(({"test": [[2], [2], [2], [2]]}, [2, 2, 2, 2]))
b2 = dataset2.repeat().batch(3).make_one_shot_iterator().get_next()

# The tensors in batches 1 and 2 have shape (3, 1); the concatenated features have shape (6, 1).
b_con = concat_batches(b1, b2)

Evaluating the example shows that b_con looks like this:

({'test': array([[1],
       [1],
       [1],
       [2],
       [2],
       [2]], dtype=int32)}, array([1, 1, 1, 2, 2, 2], dtype=int32))
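
The example above uses TF 1.x one-shot iterators. If you are on TensorFlow 2.x with eager execution (an assumption on my part, the question does not say), the same idea can probably be kept inside the input pipeline by zipping the two batched datasets and mapping concat_batches over the pairs. The sketch below uses made-up toy contents for train_data and augmentation_data (80 rows each), so treat the shapes and values as placeholders rather than your real data. It also shows why concatenate() leaves you with batches of 40: it appends whole batches instead of merging them.

```python
import tensorflow as tf

# Toy stand-ins for the question's datasets (hypothetical contents, 80 rows each).
train_data = tf.data.Dataset.from_tensor_slices(
    ({"test": tf.ones([80, 1], tf.int32)}, tf.ones([80], tf.int32)))
augmentation_data = tf.data.Dataset.from_tensor_slices(
    ({"test": 2 * tf.ones([80, 1], tf.int32)}, 2 * tf.ones([80], tf.int32)))


def concat_batches(x, y):
    # x and y are (features_dict, labels) batches with the same feature keys.
    features1, labels1 = x
    features2, labels2 = y
    features = {name: tf.concat([features1[name], features2[name]], axis=0)
                for name in features1}
    return features, tf.concat([labels1, labels2], axis=0)


# concatenate() appends the second dataset's batches after the first one's,
# so every element still holds 40 rows.
appended = train_data.batch(40).concatenate(augmentation_data.batch(40))
appended_sizes = [int(labels.shape[0]) for _, labels in appended]

# zip() pairs the i-th batch of each dataset; map() merges each pair
# into a single batch of 80 rows.
zipped = tf.data.Dataset.zip((train_data.batch(40), augmentation_data.batch(40)))
merged = zipped.map(concat_batches)
merged_sizes = [int(labels.shape[0]) for _, labels in merged]

print(appended_sizes)
print(merged_sizes)
```

Note that zip() stops as soon as the shorter dataset runs out, so if the two datasets have different lengths you may want to call repeat() on the shorter one first.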

Hope this helps!

Answer 1
3 2018-04-23 22:42:22
