How can PyTorch train a NN with only a scalar loss?
Let's say we have a NN we want to train to predict 3 values from an input. We have a set of training data:
x_train = ((1, 5, 3, 2, 6), (1, 8, 6, 9, 3), ...)
and the targets
y_train = ((25, 32, 0.12), (.125, -5, 8), ...)
How can PyTorch do the training if it only computes a scalar as the loss function? Why is it not able to compute the loss associated with each output neuron? For example, if the answer to x_train[0] is (20, 32, 0.12), we don't want to update the same weights as if the answer were (25, 37, 0.12), right? But in that case, the loss computed by PyTorch would be the same, since (for the classical MSE loss) it averages all the errors.
How can PyTorch train a NN correctly without knowing where the error comes from?
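The concern above can be made concrete with a minimal sketch (the two predictions are hypothetical, chosen so the per-neuron errors differ but the averaged loss does not):

```python
import torch

target = torch.tensor([25.0, 32.0, 0.12])
pred_a = torch.tensor([20.0, 32.0, 0.12])  # error of 5 on the first output neuron
pred_b = torch.tensor([25.0, 37.0, 0.12])  # error of 5 on the second output neuron

mse = torch.nn.MSELoss()  # default reduction='mean' averages over all outputs
# Both reduce to the same scalar (25/3 ≈ 8.33), even though the
# errors come from different output neurons:
print(mse(pred_a, target).item())
print(mse(pred_b, target).item())
```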
Your question is not specific to PyTorch; it applies to NN training in general.
In the end, every training procedure needs to minimize the loss. The loss can be many things, but ultimately it has to be a scalar, because minimizing a vector is ambiguous.
The optimizer not only calculates the loss, but also the gradient of that loss with respect to all the trainable parameters of the NN. So, hand-waving a bit (without getting into the many optimization methods), parameters that have a large effect on the loss will move a lot, and parameters that hardly affect the loss will hardly change.
This way it, in a sense, knows "where the error comes from".
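A small sketch of this, assuming a toy linear model with the input/target shapes from the question: the scalar loss is backpropagated through each output neuron, so the gradient on the bias (one entry per output neuron) is larger for the neurons with larger error.

```python
import torch

torch.manual_seed(0)
# Toy model: 5 inputs -> 3 outputs, matching the shapes in the question.
model = torch.nn.Linear(5, 3)
x = torch.tensor([[1.0, 5.0, 3.0, 2.0, 6.0]])
y = torch.tensor([[25.0, 32.0, 0.12]])

loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()

# model.bias.grad has one entry per output neuron. The chain rule routes
# the scalar loss back through each neuron separately, so neurons that
# are far from their target get a larger gradient than neurons that are
# already close (here the third target, 0.12, is near the untrained output).
print(model.bias.grad)
```

So even though the loss itself is a single number, its gradient is a vector (one component per parameter), and that vector carries exactly the per-neuron information the question is asking about.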