Theano gradient with function on tensors
I have a function that calculates the value of a scalar field over a 3D space, so I feed it 3D tensors for the x, y, and z coordinates (obtained by numpy.meshgrid) and use elementwise operations everywhere. This works as expected.
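A minimal sketch of that setup in plain numpy (the field `x**2 + y**2 + z**2` is a hypothetical stand-in for the real function):

```python
import numpy as np

# Hypothetical stand-in for the real scalar field: f(x, y, z) = x^2 + y^2 + z^2.
def field(x, y, z):
    return x**2 + y**2 + z**2

# Build the 3D coordinate tensors as described above.
ax = np.linspace(-1.0, 1.0, 3)
x, y, z = np.meshgrid(ax, ax, ax)  # three tensors of shape (3, 3, 3)

# Elementwise evaluation of the field over the whole grid.
values = field(x, y, z)
print(values.shape)  # (3, 3, 3)
```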
Now I need to calculate the gradient of the scalar field. I've been playing around with theano.tensor.grad and theano.tensor.jacobian, and I don't understand how the derivative of an elementwise operation is supposed to work.
This is an MWE that I don't understand:
import theano.tensor as T
x, y = T.matrices("xy")
expr = x**2 + y
grad = T.grad(expr[0, 0], x)
print(grad.eval({x: [[1, 2], [1, 2]], y: [[1, 1], [2, 2]]}))
It prints
[[ 2. 0.]
[ 0. 0.]]
while I would expect
[[ 2. 4.]
[ 2. 4.]]
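For reference, the expected matrix is just the analytic elementwise derivative, d(x**2 + y)/dx = 2*x, which can be checked with plain numpy:

```python
import numpy as np

# The test matrix for x from the snippet above.
x = np.array([[1.0, 2.0], [1.0, 2.0]])

# The elementwise derivative of x**2 + y with respect to x is simply 2*x.
expected = 2 * x
print(expected)  # [[2. 4.], [2. 4.]]
```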
I also tried with jacobian:
import theano.tensor as T
x, y = T.matrices("xy")
expr = x**2 + y
grad = T.jacobian(expr.flatten(), x)
print(grad.eval({x: [[1, 2], [1, 2]], y: [[1, 1], [2, 2]]}))
which returns
[[[ 2. 0.]
[ 0. 0.]]
[[ 0. 4.]
[ 0. 0.]]
[[ 0. 0.]
[ 2. 0.]]
[[ 0. 0.]
[ 0. 4.]]]
(the nonzero elements together would give me my expected matrix from the previous example)
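Indeed, because each output element depends on exactly one input element, the flattened Jacobian is a diagonal matrix, and its diagonal is exactly the elementwise gradient. A numpy sketch using the printed values above:

```python
import numpy as np

# The Jacobian of expr.flatten() w.r.t. x, as printed above: shape (4, 2, 2).
jac = np.array([[[2., 0.], [0., 0.]],
                [[0., 4.], [0., 0.]],
                [[0., 0.], [2., 0.]],
                [[0., 0.], [0., 4.]]])

# Viewed as a (4, 4) matrix it is diagonal; the diagonal, reshaped back
# to (2, 2), recovers the expected elementwise gradient.
grad = jac.reshape(4, 4).diagonal().reshape(2, 2)
print(grad)  # [[2. 4.], [2. 4.]]
```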
Is there some way to get the elementwise gradients I need? Can I, for example, somehow define the function as a scalar one (three scalars in, one scalar out) and apply it elementwise over the coordinate tensors? That way the derivative would also be just a simple scalar and everything would work smoothly.
The first element expr[0, 0], taken as a cost with respect to x, only relates to the first element of x, so the result you are receiving is correct.
The result you expect is produced if you sum the whole expr array. Theano will take care of propagating the gradient backward through the sum:
import theano.tensor as T
x, y = T.matrices("xy")
expr = x**2 + y
grad = T.grad(expr.sum(), x)
print(grad.eval({x: [[1, 2], [1, 2]], y: [[1, 1], [2, 2]]}))
prints
[[ 2. 4.]
[ 2. 4.]]
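As a sanity check that does not need Theano at all, the same result follows from central finite differences on expr.sum() (plain numpy, using the same test matrices):

```python
import numpy as np

x = np.array([[1.0, 2.0], [1.0, 2.0]])
y = np.array([[1.0, 1.0], [2.0, 2.0]])

def cost(x):
    # The summed expression whose gradient we want.
    return (x**2 + y).sum()

# Central finite differences: each element of expr depends only on the
# matching element of x, so the gradient of the sum is the elementwise
# derivative 2*x.
eps = 1e-6
grad = np.zeros_like(x)
for i in range(x.shape[0]):
    for j in range(x.shape[1]):
        dx = np.zeros_like(x)
        dx[i, j] = eps
        grad[i, j] = (cost(x + dx) - cost(x - dx)) / (2 * eps)

print(np.round(grad, 4))  # matches 2*x = [[2. 4.], [2. 4.]]
```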