Tensorflow concat tf.data.Dataset Batches

When using tf.data.Dataset, is it possible to concatenate the batches of two datasets so that, instead of the second dataset being appended to the end of the first, the first batch of the second dataset is concatenated after the first batch of the first dataset, and so on?

I tried the following, but it gave me a dataset with length 40, whereas I would expect length 80 here.

train_data = train_data.batch(40).concatenate(augmentation_data.batch(40))
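To illustrate why the attempt above still yields batches of 40, here is a minimal sketch. It assumes TensorFlow 2.x with eager execution, and the two `Dataset.range` datasets are hypothetical stand-ins for `train_data` and `augmentation_data` (their real contents are not given in the question). `concatenate` appends the batches of the second dataset after all batches of the first; it does not merge them pairwise.

```python
import tensorflow as tf

# Hypothetical stand-ins for train_data and augmentation_data (80 elements each).
train_data = tf.data.Dataset.range(80)
augmentation_data = tf.data.Dataset.range(100, 180)

# Same pattern as in the question: batch each dataset, then concatenate.
combined = train_data.batch(40).concatenate(augmentation_data.batch(40))

# Every element of `combined` is still a single batch of 40 items;
# the batches of the second dataset simply follow those of the first.
batch_sizes = [int(batch.shape[0]) for batch in combined]
print(batch_sizes)
```

So the result has more batches, not larger ones, which matches the length of 40 reported above.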

I'm not exactly sure what your use case is, but you might want to concatenate the feature and label tensors of each batch separately, like this:

import tensorflow as tf

def concat_batches(x, y):
    # x and y are (features_dict, labels) batches, one from each dataset
    features1, labels1 = x
    features2, labels2 = y
    features = {name: tf.concat([features1[name], features2[name]], axis=0) for name in features1}
    return features, tf.concat([labels1, labels2], axis=0)

Here is an example:

# Note: Dataset.make_one_shot_iterator() is the TF 1.x graph-mode API
dataset = tf.data.Dataset.from_tensor_slices(({"test": [[1], [1], [1], [1]]}, [1, 1, 1, 1]))
b1 = dataset.repeat().batch(3).make_one_shot_iterator().get_next()
dataset2 = tf.data.Dataset.from_tensor_slices(({"test": [[2], [2], [2], [2]]}, [2, 2, 2, 2]))
b2 = dataset2.repeat().batch(3).make_one_shot_iterator().get_next()

# The feature tensors of b1 and b2 have shape (3, 1); those of the concatenated batch have shape (6, 1)
b_con = concat_batches(b1, b2)

When you evaluate the example, b_con looks like this:

({'test': array([[1],
       [1],
       [1],
       [2],
       [2],
       [2]], dtype=int32)}, array([1, 1, 1, 2, 2, 2], dtype=int32))
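If you would rather keep everything inside the tf.data pipeline, one possible variant is to pair the two batched datasets with `tf.data.Dataset.zip` and apply the same concatenation in `map`. This is only a sketch under assumptions: it targets TensorFlow 2.x with eager execution, and the names `zipped` and `merged` are made up for illustration. It reuses the example data and the `concat_batches` function from this answer, repeated here so the snippet runs on its own.

```python
import tensorflow as tf

def concat_batches(x, y):
    # x and y are (features_dict, labels) batches, one from each dataset.
    features1, labels1 = x
    features2, labels2 = y
    features = {
        name: tf.concat([features1[name], features2[name]], axis=0)
        for name in features1
    }
    return features, tf.concat([labels1, labels2], axis=0)

dataset = tf.data.Dataset.from_tensor_slices(({"test": [[1], [1], [1], [1]]}, [1, 1, 1, 1]))
dataset2 = tf.data.Dataset.from_tensor_slices(({"test": [[2], [2], [2], [2]]}, [2, 2, 2, 2]))

# Pair the i-th batch of the first dataset with the i-th batch of the second.
zipped = tf.data.Dataset.zip((dataset.repeat().batch(3), dataset2.repeat().batch(3)))

# map() unpacks each zipped pair into the two arguments of concat_batches.
merged = zipped.map(concat_batches)

# Both inputs repeat forever, so only take the first merged batch here.
features, labels = next(iter(merged))
print(features["test"].numpy().tolist(), labels.numpy().tolist())
```

Each element of `merged` is then a batch of 6 rows (3 from each dataset), so with `batch(40)` on both inputs the merged batches would have length 80.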

Hope this helps!
