
Iteratively build tensor in Tensorflow

Let's say I have a function that takes a Tensor as input (of a given dimensionality) and returns another Tensor as output. I would like to apply that function to a batch of inputs and have it return a batch of outputs, so both the input and the output would have one more dimension.

I could write a tf.while_loop to execute my function on all the inputs in the batch, but I am unsure how to store the output of each single element of the batch. I have an idea of how to do this, which should also clarify what I am trying to do, but I am not sure it would be optimal.

import tensorflow as tf

batch = tf.random.uniform([4, 3, 2])  # batch of size 4 of (3,2)-shaped tensors

def apply_to_batch(batch):
    output = tf.zeros([0, 5])  # the output should be a batch of 4 (4,5)-shaped tensors;
    # concatenate the single outputs to this tensor, then reshape it
    for i in tf.range(len(batch)):
        # MyVeryNiceFunction returns a (4,5)-shaped tensor
        output = tf.concat((output, MyVeryNiceFunction(batch[i])), 0)
    return tf.reshape(output, (4, 4, 5))  # (batch_size, per-example output shape)

This code certainly gives the output I want, but would it allow each execution of the loop to be parallelized? Is there a better way to do this? Is there a proper data structure that would let me store the output of each loop execution and then efficiently build the output Tensor from it?
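For reference, here is a sketch of the same loop using tf.TensorArray, which as far as I understand is TensorFlow's structure for accumulating per-iteration results inside a loop (MyVeryNiceFunction is still just my placeholder):

import tensorflow as tf

batch = tf.random.uniform([4, 3, 2])
ta = tf.TensorArray(dtype=tf.float32, size=4)  # one slot per batch element
for i in tf.range(4):
    # write() returns the updated TensorArray, so it must be reassigned
    ta = ta.write(i, MyVeryNiceFunction(batch[i]))
output = ta.stack()  # stacks the four (4,5) tensors into a (4,4,5) tensor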

In general, iterating over a dimension is very likely to be the wrong approach. In TF (as in MATLAB and NumPy), the goal is vectorization: describing your operations in a way that touches all elements of the batch at the same time.

For example, let's say my dataset is composed of length-2 vectors, and I have a batch of 4 of them.

data = tf.convert_to_tensor([[1,2], [3,4], [5,6], [7,8]], tf.float32)
>>> data
<tf.Tensor: shape=(4, 2), dtype=float32, numpy=
array([[1., 2.],
       [3., 4.],
       [5., 6.],
       [7., 8.]], dtype=float32)>

If you wanted to add an element to each vector in a vectorized way, for instance some statistic such as the variance, you'd do this. Notice how you are constantly thinking about tensor shapes and dimensions and how to concat/append tensors. It's common to document tensor shapes constantly and even to assert them. Welcome to TF programming.

vars = tf.math.reduce_variance(data, axis=1, keepdims=True)
tf.debugging.assert_equal(tf.shape(vars), [4, 1])
tf.concat(values=[data, vars], axis=1)


<tf.Tensor: shape=(4, 3), dtype=float32, numpy=
array([[1.  , 2.  , 0.25],
       [3.  , 4.  , 0.25],
       [5.  , 6.  , 0.25],
       [7.  , 8.  , 0.25]], dtype=float32)>
