
Using sigmoid output for cross entropy loss on Pytorch

I'm trying to modify Yolo v1 to work with my task, in which each object has only one class (e.g., an object cannot be both cat and dog).

Due to the architecture (other outputs, such as the localization predictions, must be handled as regression), a sigmoid was applied to the last output of the model (f.sigmoid(nearly_last_output)). For classification, Yolo v1 also uses MSE as the loss. But as far as I know, MSE sometimes does not work as well as cross entropy for one-hot targets like the ones I have.

To be specific, the ground truth looks like this: 0 0 0 0 1 (say we have only 5 classes in total; each object has exactly one class, so there is only a single 1 in the vector, and in this example it marks the 5th class),

and the model's output at the classification part is: 0.1 0.1 0.9 0.2 0.1
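The setup above can be sketched as tensors (a minimal illustration; the values are taken from the question, and the variable names are made up):

```python
import torch

# One-hot ground truth for 5 classes; the single 1 marks the true class (class 5)
gt = torch.tensor([0., 0., 0., 0., 1.])

# Per-class sigmoid outputs from the classification head
pred = torch.tensor([0.1, 0.1, 0.9, 0.2, 0.1])

# Exactly one positive class per object, so the one-hot vector sums to 1
print(gt.sum().item(), pred.shape)
```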

I found some suggestions to use nn.BCE / nn.BCEWithLogitsLoss, but since I'm not good at math I may be wrong somewhere, so I'm asking here to learn more and to be sure: which loss should I use, and how do I use it correctly?

  1. MSE loss is usually used for regression problems.

  2. For binary classification, you can use either nn.BCELoss or nn.BCEWithLogitsLoss. BCEWithLogitsLoss combines a sigmoid with the BCE loss, so if a sigmoid is already applied on the last layer, you can directly use nn.BCELoss.

  3. The ground truth in your case describes a multi-class classification problem, and the output shown doesn't really correspond to multi-class classification. So, in this case, you can apply nn.CrossEntropyLoss, which combines softmax and log loss and is suited to multi-class classification problems.
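The two options above can be contrasted in a minimal sketch. The logit values here are made up so that their sigmoids roughly reproduce the 0.1 0.1 0.9 0.2 0.1 outputs from the question; note that both loss modules expect raw logits, not sigmoid/softmax outputs:

```python
import torch
import torch.nn as nn

# Raw pre-activation logits from the classification head (illustrative values)
logits = torch.tensor([[-2.2, -2.2, 2.2, -1.4, -2.2]])

# Option A: treat each class as an independent binary problem.
# BCEWithLogitsLoss applies the sigmoid internally, so pass raw logits
# and a float one-hot target of the same shape.
one_hot = torch.tensor([[0., 0., 0., 0., 1.]])
bce = nn.BCEWithLogitsLoss()(logits, one_hot)

# Option B: single-label multi-class (each object has exactly one class).
# CrossEntropyLoss applies log-softmax internally and takes a class index.
target = torch.tensor([4])  # class 5 -> index 4
ce = nn.CrossEntropyLoss()(logits, target)

print(bce.item(), ce.item())
```

With a single-label constraint like "an object cannot be both cat and dog", option B is the natural fit, since softmax forces the class probabilities to compete and sum to 1.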
