How does a TensorFlow model calculate loss if the model's output layer has multiple neurons and there is only one value to predict?
I recently noticed that I can build and use a TensorFlow model with any number of outputs, even when the value to predict is a single real number. For example, the model can have 16 outputs while there is only 1 real target value, and it still trains normally with no errors.
My question is how TensorFlow handles the mismatched output size when it compares the output to the target value. How can it still calculate the loss? Does it try to push every output as close to the target value as possible, or does it do some kind of averaging?
Categorical cross-entropy loss is used for a model that has any number of output nodes (one per class) and a single true label to predict.
Categorical cross-entropy loss = Softmax activation + Cross-entropy loss
Let's assume we have 4 classes in our model. The softmax activation function computes a probability for each of the 4 classes from the raw outputs. The cross-entropy loss is then calculated from the probability assigned to the true label class.
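Written out as formulas, the two steps look like this. The symbols are my own notation and not from the original answer: z_i is the raw output of node i, K is the number of classes, y is the one-hot true label, and c is the index of the true class.

```latex
p_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}, \qquad
L = -\sum_{i=1}^{K} y_i \ln p_i = -\ln p_c
```

Because y is 1 only at the true class, every other term in the sum drops out and only the probability of the true class contributes to the loss.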
Consider the following example:
The 4 output nodes are Rabbit, Cat, Dog, Squirrel.
True label: Rabbit
Predictions (raw outputs): Rabbit = 8, Cat = 1, Dog = 4, Squirrel = 2
Softmax calculation: R = 2980.95/3045.63, C = 2.71/3045.63, D = 54.59/3045.63, S = 7.38/3045.63 (here the numerators are e^8, e^1, e^4 and e^2, and 3045.63 = 2980.95 + 2.71 + 54.59 + 7.38 is their sum)
Cross-entropy loss: -(1 * ln(R) + 0 * ln(C) + 0 * ln(D) + 0 * ln(S)) = -(-0.0215 + 0 + 0 + 0) ≈ 0.0215
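To check the arithmetic above, here is a minimal sketch in plain Python using only the standard library. The function names softmax and categorical_cross_entropy are my own helpers for illustration, not TensorFlow APIs, and this is not a claim about how TensorFlow implements the loss internally.

```python
import math

def softmax(logits):
    """Turn raw outputs into probabilities that sum to 1."""
    # Subtracting the maximum does not change the result but avoids overflow.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(one_hot, probs):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    # Terms with a zero label contribute nothing, so they are skipped.
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)

classes = ["Rabbit", "Cat", "Dog", "Squirrel"]
logits = [8.0, 1.0, 4.0, 2.0]        # raw outputs from the example
true_label = [1.0, 0.0, 0.0, 0.0]    # one-hot label for Rabbit

probs = softmax(logits)
loss = categorical_cross_entropy(true_label, probs)

for name, p in zip(classes, probs):
    print(f"{name}: {p:.4f}")
print(f"loss: {loss:.4f}")
```

Running it gives Rabbit a probability of roughly 0.98 and a loss of roughly 0.021, in line with the hand calculation.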
This is how the model handles the outputs and calculates the loss. Thanks!