
How does Pytorch's `autograd` handle non-mathematical functions?

During my training process I tend to make a lot of calls to torch.cat() and to copy tensors into new tensors. How does autograd handle these operations? Do they affect the gradient values?

As pointed out in the comments, cat is a mathematical function. For example, we could write the following (special-case) definition of cat in more traditional mathematical notation as

$$\operatorname{cat}(a, b) = \begin{bmatrix} a_1 & \cdots & a_n & b_1 & \cdots & b_m \end{bmatrix}^{\mathsf T}, \qquad a \in \mathbb{R}^n,\ b \in \mathbb{R}^m$$

The Jacobian of this function with respect to either of its inputs can be expressed as

$$\frac{\partial \operatorname{cat}(a, b)}{\partial a} = \begin{bmatrix} I_n \\ 0_{m \times n} \end{bmatrix}, \qquad \frac{\partial \operatorname{cat}(a, b)}{\partial b} = \begin{bmatrix} 0_{n \times m} \\ I_m \end{bmatrix}$$

Since the Jacobian is well defined, you can, of course, apply back-propagation.
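To see this concretely, here is a minimal sketch (my own, not part of the original answer) that checks the gradients torch.cat produces against the block-identity Jacobian above; the tensor names and the weighting by 0..4 are just illustrative:

```python
import torch

# a and b are the two inputs being concatenated; weighting the result by
# 0..4 gives a distinct upstream gradient for each position of the
# concatenated vector, so the routing back to a and b is easy to read off.
a = torch.randn(3, requires_grad=True)
b = torch.randn(2, requires_grad=True)

c = torch.cat([a, b])                  # shape (5,)
loss = (c * torch.arange(5.0)).sum()   # d(loss)/dc = [0, 1, 2, 3, 4]
loss.backward()

print(a.grad)  # tensor([0., 1., 2.]) -> routed through the I_n block
print(b.grad)  # tensor([3., 4.])     -> routed through the I_m block
```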

In reality you generally wouldn't define these operations with such notation, and a general definition of the cat operation used by pytorch written this way would be cumbersome.

That said, internally autograd uses backward algorithms that take the gradients of such "index style" operations into account just like any other function.
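The same holds for slicing and copying. As a small illustrative sketch (again my own, using only standard slicing and .clone()), gradients from a copied slice route back to exactly the positions of the source tensor that were used:

```python
import torch

# Copy a slice of x into a new tensor; both the slice and the clone are
# recorded by autograd like ordinary functions.
x = torch.randn(5, requires_grad=True)
y = x[1:4].clone()

(y ** 2).sum().backward()

print(x.grad)  # zero outside the copied slice, 2 * x inside it
```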
