
Merging one tf.data.Dataset with every other element of another one

I would like to merge two tf.data.Dataset objects so that only every other sample of the first one is combined with a sample of the second, without losing any samples.

For example, take two lists of numbers:

ds1 = tf.data.Dataset.range(10)
ds10 = tf.data.Dataset.range(10, 60, 10)

I want to combine them so that samples from the second are added to samples of the first, but only at every other position:

0, 11, 2, 23, 4, 35, 6, 47, 8, 59

There is a zip method that can merge two datasets, but it does so by drawing a sample from each of them -- not combining some of those samples would mean dropping samples from ds10, which is not what I want.
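For reference, this is what a plain zip gives here (a minimal sketch, assuming TF 2.x so that Dataset.as_numpy_iterator is available): it pairs the two datasets element by element and stops once the shorter one is exhausted, so the interleaved pattern above cannot come out of it directly.

import tensorflow as tf

ds1 = tf.data.Dataset.range(10)
ds10 = tf.data.Dataset.range(10, 60, 10)

# zip draws one sample from each dataset and stops with the shorter one,
# producing only the five pairs (0, 10), (1, 20), ..., (4, 50).
for a, b in tf.data.Dataset.zip((ds1, ds10)).as_numpy_iterator():
    print(a, b)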

I could go on from there by zipping ds10 with "dummy" samples that get dropped during the zip with ds1, but that does not look very efficient.
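One possible reading of that padding workaround is sketched below. This is my own illustration, not code from the question: it assumes 0 is used as the dummy value, so the ds1 samples at the dummy positions simply pass through unchanged.

import tensorflow as tf

ds1 = tf.data.Dataset.range(10)
ds10 = tf.data.Dataset.range(10, 60, 10)

# Put a dummy 0 in front of every real ds10 sample: 0, 10, 0, 20, ...
padded = ds10.map(lambda x: tf.stack([tf.zeros_like(x), x])).unbatch()

# Now both datasets have 10 elements, so a plain zip plus add works,
# but every other element carries a dummy value around.
combined = tf.data.Dataset.zip((ds1, padded)).map(lambda a, b: a + b)
# -> 0, 11, 2, 23, 4, 35, 6, 47, 8, 59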

Is there an efficient way to do this, without dropping samples (either real or "dummy" ones)?

Try this:

import tensorflow as tf

def combine(pair, to_add):
    # `pair` holds two consecutive samples of ds1 (thanks to batch(2));
    # add the matching ds2 sample to the second one only.
    combined = [pair[0], pair[1] + to_add]
    return tf.data.Dataset.from_tensor_slices(combined)

ds1 = tf.data.Dataset.range(10)
ds2 = tf.data.Dataset.range(10, 60, 10)

combined = tf.data.Dataset.zip((ds1.batch(2), ds2)).flat_map(combine)

Explanation:

First, batch the first dataset with ds1.batch(2). This produces [(0,1), (2,3), ...].
Zip this with the other dataset to get [((0,1),10), ((2,3),20), ...].
Undo the batching with flat_map, and in the process combine each (a,b) with its c in [((a,b),c), ...] to get [(a,b+c), ...].
The result is then flattened to remove the inner parentheses, and you get [0, 11, 2, 23, 4, 35, 6, 47, 8, 59].
Batching and unbatching like this is a common pattern when dealing with several tf.data.Datasets.
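As a quick check (assuming TF 2.x), iterating over the result reproduces the sequence from the question:

print(list(combined.as_numpy_iterator()))
# [0, 11, 2, 23, 4, 35, 6, 47, 8, 59]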
