Theano gradient with function on tensors
I have a function that calculates the value of a scalar field over a 3D space, so I feed it 3D tensors for the x, y, and z coordinates (obtained by numpy.meshgrid) and use elementwise operations everywhere. This works as expected.
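A minimal sketch of that setup in plain numpy (the field `x**2 + y**2 + z**2` is a hypothetical stand-in for the real function):

```python
import numpy as np

# Hypothetical stand-in for the real scalar field: f(x, y, z) = x^2 + y^2 + z^2.
def field(x, y, z):
    return x**2 + y**2 + z**2

# Build the 3D coordinate tensors as described above.
ax = np.linspace(-1.0, 1.0, 3)
x, y, z = np.meshgrid(ax, ax, ax)  # three tensors of shape (3, 3, 3)

# Elementwise evaluation of the field over the whole grid.
values = field(x, y, z)
print(values.shape)  # (3, 3, 3)
```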
Now I need to calculate the gradient of the scalar field. I've been playing around with theano.tensor.grad and theano.tensor.jacobian, and I don't understand how the derivative of an elementwise operation is supposed to work.
This is an MWE that I don't understand:
import theano.tensor as T
x, y = T.matrices("xy")
expr = x**2 + y
grad = T.grad(expr[0, 0], x)
print(grad.eval({x: [[1, 2], [1, 2]], y: [[1, 1], [2, 2]]}))
It prints
[[ 2. 0.]
[ 0. 0.]]
while I would expect
[[ 2. 4.]
[ 2. 4.]]
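For reference, the expected matrix is just the analytic elementwise derivative, d(x**2 + y)/dx = 2*x, which can be checked with plain numpy:

```python
import numpy as np

# The test matrix for x from the snippet above.
x = np.array([[1.0, 2.0], [1.0, 2.0]])

# The elementwise derivative of x**2 + y with respect to x is simply 2*x.
expected = 2 * x
print(expected)  # [[2. 4.], [2. 4.]]
```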
I also tried with jacobian:
import theano.tensor as T
x, y = T.matrices("xy")
expr = x**2 + y
grad = T.jacobian(expr.flatten(), x)
print(grad.eval({x: [[1, 2], [1, 2]], y: [[1, 1], [2, 2]]}))
which returns
[[[ 2. 0.]
[ 0. 0.]]
[[ 0. 4.]
[ 0. 0.]]
[[ 0. 0.]
[ 2. 0.]]
[[ 0. 0.]
[ 0. 4.]]]
(the nonzero elements together would give me my expected matrix from the previous example)
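Indeed, because each output element depends on exactly one input element, the flattened Jacobian is a diagonal matrix, and its diagonal is exactly the elementwise gradient. A numpy sketch using the printed values above:

```python
import numpy as np

# The Jacobian of expr.flatten() w.r.t. x, as printed above: shape (4, 2, 2).
jac = np.array([[[2., 0.], [0., 0.]],
                [[0., 4.], [0., 0.]],
                [[0., 0.], [2., 0.]],
                [[0., 0.], [0., 4.]]])

# Viewed as a (4, 4) matrix it is diagonal; the diagonal, reshaped back
# to (2, 2), recovers the expected elementwise gradient.
grad = jac.reshape(4, 4).diagonal().reshape(2, 2)
print(grad)  # [[2. 4.], [2. 4.]]
```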
Is there some way to get the elementwise gradients I need? Can I, for example, somehow define the function as a scalar one (three scalars in, one scalar out) and apply it elementwise over the coordinate tensors? That way the derivative would also be just a simple scalar and everything would work smoothly.
The first element expr[0, 0], taken as a cost with respect to x, only relates to the first element of x, so the result you are receiving is correct.
The result you expect is produced if you sum the whole expr array. Theano will take care of propagating the gradient backward through the sum:
import theano.tensor as T
x, y = T.matrices("xy")
expr = x**2 + y
grad = T.grad(expr.sum(), x)
print(grad.eval({x: [[1, 2], [1, 2]], y: [[1, 1], [2, 2]]}))
prints
[[ 2. 4.]
[ 2. 4.]]
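As a sanity check that does not need Theano at all, the same result follows from central finite differences on expr.sum() (plain numpy, using the same test matrices):

```python
import numpy as np

x = np.array([[1.0, 2.0], [1.0, 2.0]])
y = np.array([[1.0, 1.0], [2.0, 2.0]])

def cost(x):
    # The summed expression whose gradient we want.
    return (x**2 + y).sum()

# Central finite differences: each element of expr depends only on the
# matching element of x, so the gradient of the sum is the elementwise
# derivative 2*x.
eps = 1e-6
grad = np.zeros_like(x)
for i in range(x.shape[0]):
    for j in range(x.shape[1]):
        dx = np.zeros_like(x)
        dx[i, j] = eps
        grad[i, j] = (cost(x + dx) - cost(x - dx)) / (2 * eps)

print(np.round(grad, 4))  # matches 2*x = [[2. 4.], [2. 4.]]
```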