
Tensorflow gradients for every item of tensor

I have a network that takes an Nx3 matrix as input and produces an N-dimensional vector. Let's say the batch size is 1 and N=1024, so the output has shape (1, 1024). I want to compute the gradient of every dimension of the output with respect to the input, i.e. dy/dx for every y. However, TensorFlow's tf.gradients computes d sum(y)/dx, the aggregated gradient. I know there is no straightforward way to compute the gradients for every output dimension separately, so I finally decided to run tf.gradients 1024 times, because I only have to do this once in the project and never again.
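
For reference, here is a minimal sketch (with a toy y = x ** 2 standing in for the network) showing that tf.gradients aggregates over the output elements rather than returning a per-element Jacobian:

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=(3,))
y = x ** 2  # toy "network": y_i = x_i ** 2

# tf.gradients returns d(sum(y))/dx, a single vector, not the full Jacobian
g = tf.gradients(y, x)[0]

with tf.Session() as sess:
    print(sess.run(g, {x: [1.0, 2.0, 3.0]}))  # [2. 4. 6.], i.e. 2*x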

So I do this:

import datetime
import tensorflow as tf

start = datetime.datetime.now()

# split the output into 1024 single-element tensors
output_code_split = tf.split(output_code, 1024)
# output shape = (1024,)

grad_ops = []
for i in range(1024):
    # gradient of one output element w.r.t. the input
    gr = tf.gradients(output_code_split[i], input)
    # output shape = (1024, 1, 16, 1024, 3), where 16 = batch size

    gr = tf.reduce_mean(gr, [0, 1, 2, 3])
    # output shape = (1024,)

    grad_ops.append(gr)

    present = datetime.datetime.now()
    print(i, (present - start).seconds, flush=True)
    # prints the time taken to finish the previous iteration
    start = datetime.datetime.now()

When the code started running, the time between two iterations was 4 seconds, so I figured it would run for roughly 4096 seconds. However, as the number of iterations increases, the time taken by subsequent iterations keeps growing. The gap, which was 4 seconds when the code started, eventually reached 30 seconds after about 500 iterations, which is too much.

Is the list holding the gradient ops, grad_ops, growing bigger and occupying more memory? I'm unfortunately not in a position to do detailed memory profiling of this code. Any ideas about what causes the iteration time to blow up as the loop goes on?

(Note that in this code I'm only creating the gradient ops, not actually evaluating them. That part comes later, but my code never reaches it because of the extreme slowdown mentioned above.)

Thanks.

What blows up your execution time is that you define a new operation on the graph in every iteration of your for loop. Every call to tf.gradients and tf.reduce_mean pushes a new node onto the graph, which then needs to be recompiled before it can be run. What should actually work for you is to use tf.gather with an int32 placeholder, which supplies the dimension to your gradient operation. So something like this:

idx_placeholder = tf.placeholder(tf.int32, shape=(None,))
# gradient of the selected output element(s) w.r.t. the input
grad_operation = tf.gradients(tf.gather(output_code_split, idx_placeholder), input)

for i in range(1024):
    grads = sess.run(grad_operation, {idx_placeholder: np.array([i])})
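
(As a side note beyond the original answer: if you want to verify that no new ops are accidentally added inside the loop, you can finalize the graph before entering it; TensorFlow will then raise an error on any further graph modification.)

sess.graph.finalize()  # any attempt to add an op after this raises a RuntimeError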
