
Theano gradient with function on tensors

I have a function that calculates the value of a scalar field over a 3D space, so I feed it 3D tensors for the x, y and z coordinates (obtained from numpy.meshgrid) and use elementwise operations everywhere. This works as expected.
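For concreteness, the setup described above might look like the following (the field itself is a hypothetical stand-in, since the question does not show it; any elementwise expression behaves the same way):

```python
import numpy as np

# Hypothetical scalar field, evaluated elementwise on the grid
# (illustrative only -- the question's actual field is not shown).
def field(x, y, z):
    return x**2 + y**2 + z**2

# 3D coordinate tensors from numpy.meshgrid, as described above.
xs = np.linspace(-1.0, 1.0, 4)
X, Y, Z = np.meshgrid(xs, xs, xs, indexing="ij")

values = field(X, Y, Z)  # one scalar per grid point, shape (4, 4, 4)
```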

Now I need to calculate the gradient of the scalar field. I've been experimenting with theano.tensor.grad and theano.tensor.jacobian, and I don't understand how the derivative of an elementwise operation is supposed to work.

This is a MWE that I don't understand:

import theano.tensor as T

x, y = T.matrices("x", "y")

expr = x**2 + y
grad = T.grad(expr[0, 0], x)
print(grad.eval({x: [[1, 2], [1, 2]], y: [[1, 1], [2, 2]]}))

It prints

[[ 2.  0.]
 [ 0.  0.]]

while I would expect

[[ 2.  4.]
 [ 2.  4.]]

I also tried with jacobian:

import theano.tensor as T

x, y = T.matrices("x", "y")

expr = x**2 + y
grad = T.jacobian(expr.flatten(), x)
print(grad.eval({x: [[1, 2], [1, 2]], y: [[1, 1], [2, 2]]}))

which returns

[[[ 2.  0.]
  [ 0.  0.]]

 [[ 0.  4.]
  [ 0.  0.]]

 [[ 0.  0.]
  [ 2.  0.]]

 [[ 0.  0.]
  [ 0.  4.]]]

(the nonzero elements taken together would give me the expected matrix from the previous example)

Is there some way to get the elementwise gradients I need?

Can I, for example, somehow define the function as a scalar one (three scalars in, one scalar out) and apply it elementwise over the coordinate tensors? That way the derivative would also be just a simple scalar and everything would work smoothly.

The first element expr[0, 0], taken as the cost, depends only on the first element of x, so the result you are receiving is correct.

The result you expect is produced if you sum the whole expr array. Theano will take care of back-propagating the gradient through the sum:

import theano.tensor as T

x, y = T.matrices("x", "y")

expr = x**2 + y
grad = T.grad(expr.sum(), x)
print(grad.eval({x: [[1, 2], [1, 2]], y: [[1, 1], [2, 2]]}))

prints

[[ 2.  4.]
 [ 2.  4.]]
