
How does PyTorch compute the backward pass when optimizing triplet loss?

I am implementing a triplet network in PyTorch where the 3 instances (sub-networks) share the same weights. Since the weights are shared, I implemented it as a single-instance network that is called three times to produce the anchor, positive, and negative embeddings. The embeddings are learned by optimizing the triplet loss. Here is a small snippet for illustration:

from dependencies import *
model = SingleSubNet()  # represents each instance in the triplet net

for epoch in epochs:
    for anch, pos, neg in train_loader:  # each batch yields an (anchor, positive, negative) triple
        optimizer.zero_grad()
        fa, fp, fn = model(anch), model(pos), model(neg)  # one shared network, three forward passes
        loss = triplet_loss(fa, fp, fn)
        loss.backward()
        optimizer.step()
        # Do more stuff ...

My complete code works as expected. However, I do not understand what gradient(s) loss.backward() computes in this case. I am confused because there are 3 gradients of the loss in each learning step (the gradient formulas are here). I assume the gradients are summed before performing optimizer.step(). But then it looks from the equations that if the gradients are summed, they will cancel each other out and yield a zero update term. Of course, this is not true, as the network learns meaningful embeddings in the end.
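(For reference, assuming the linked formulas are the standard gradients of a squared-Euclidean triplet loss with margin $\alpha$, for a triple where the loss is active they come out to something like

$$L = \|f_a - f_p\|^2 - \|f_a - f_n\|^2 + \alpha,$$

$$\frac{\partial L}{\partial f_a} = 2(f_n - f_p), \qquad \frac{\partial L}{\partial f_p} = 2(f_p - f_a), \qquad \frac{\partial L}{\partial f_n} = 2(f_a - f_n),$$

which indeed sum to zero, hence my confusion.)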

Thanks in advance

Late answer, but hope this helps someone. The gradients that you linked are the gradients of the loss with respect to the embeddings (the anchor, positive, and negative embeddings). To update the model parameters, you use the gradient of the loss with respect to the model parameters. This does not sum to zero.
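A sketch of why, in terms of the chain rule (with $\theta$ denoting the shared parameters of the sub-network):

$$\frac{\partial L}{\partial \theta} = \frac{\partial L}{\partial f_a}\,\frac{\partial f_a}{\partial \theta} + \frac{\partial L}{\partial f_p}\,\frac{\partial f_p}{\partial \theta} + \frac{\partial L}{\partial f_n}\,\frac{\partial f_n}{\partial \theta}$$

The three Jacobians $\partial f / \partial \theta$ are evaluated at three different inputs, so even though the embedding-level gradients sum to zero, the weighted sum above generally does not.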

The reason for this is that when calculating the gradient of the loss with respect to the model parameters, the formula makes use of the activations from the forward pass, and the 3 different inputs (anchor image, positive example, and negative example) have different activations in the forward pass.
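To see this concretely, here is a minimal, hypothetical sketch (a toy linear embedding and torch.nn.TripletMarginLoss, not the asker's SingleSubNet) showing that the parameter gradient accumulated over the three forward passes does not vanish:

import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy shared sub-network: a single linear layer mapping 8-d inputs to 4-d embeddings
embed = nn.Linear(8, 4, bias=False)
criterion = nn.TripletMarginLoss(margin=1.0)

anch = torch.randn(2, 8)
pos = torch.randn(2, 8)
neg = torch.randn(2, 8)

# Three forward passes through the SAME parameters build one autograd graph;
# backward() accumulates the parameter gradient contributed by all three branches.
fa, fp, fn_ = embed(anch), embed(pos), embed(neg)
loss = criterion(fa, fp, fn_)
loss.backward()

# The gradients w.r.t. the embeddings sum to zero, but each branch's contribution
# to the weight gradient is multiplied by that branch's own input activations,
# so the accumulated parameter gradient is generally nonzero
# (assuming the margin term is active for at least one triple in the batch).
print(embed.weight.grad.abs().sum())  # > 0 in general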

