Why would the loss decrease while the accuracy stays the same?

I am training a normal feed-forward network on financial data of the last 90 days of a stock, and I am predicting whether the stock will go up or down on the next day. I am using binary cross entropy as my loss and standard SGD for the optimizer. When I train, the training and validation loss continue to go down as they should, but the accuracy and validation accuracy stay around the same.

Here's my model:

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 90, 256)           1536
_________________________________________________________________
elu (ELU)                    (None, 90, 256)           0
_________________________________________________________________
flatten (Flatten)            (None, 23040)             0
_________________________________________________________________
dropout (Dropout)            (None, 23040)             0
_________________________________________________________________
dense_1 (Dense)              (None, 1024)              23593984
_________________________________________________________________
elu_1 (ELU)                  (None, 1024)              0
_________________________________________________________________
dropout_1 (Dropout)          (None, 1024)              0
_________________________________________________________________
dense_2 (Dense)              (None, 512)               524800
_________________________________________________________________
elu_2 (ELU)                  (None, 512)               0
_________________________________________________________________
dropout_2 (Dropout)          (None, 512)               0
_________________________________________________________________
dense_3 (Dense)              (None, 512)               262656
_________________________________________________________________
elu_3 (ELU)                  (None, 512)               0
_________________________________________________________________
dropout_3 (Dropout)          (None, 512)               0
_________________________________________________________________
dense_4 (Dense)              (None, 256)               131328
_________________________________________________________________
activation (Activation)      (None, 256)               0
_________________________________________________________________
dense_5 (Dense)              (None, 2)                 514
_________________________________________________________________
activation_1 (Activation)    (None, 2)                 0
_________________________________________________________________
Total params: 24,514,818
Trainable params: 24,514,818
Non-trainable params: 0
_________________________________________________________________
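
For reference, a Keras definition that reproduces the summary above could look like the sketch below. The input width of 5 features per day is inferred from the first layer's parameter count ((5 + 1) * 256 = 1,536); the dropout rates and the two unnamed activations are assumptions, not something the summary shows.

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Activation, Dense, Dropout, ELU, Flatten

model = Sequential([
    Dense(256, input_shape=(90, 5)),  # (5 + 1) * 256 = 1,536 params
    ELU(),
    Flatten(),                        # 90 * 256 = 23,040 units
    Dropout(0.5),                     # rate assumed
    Dense(1024),
    ELU(),
    Dropout(0.5),
    Dense(512),
    ELU(),
    Dropout(0.5),
    Dense(512),
    ELU(),
    Dropout(0.5),
    Dense(256),
    Activation('relu'),               # actual activation not shown in the summary
    Dense(2),
    Activation('softmax'),            # assumed, given the 2-unit output
])
model.compile(optimizer='sgd', loss='binary_crossentropy',
              metrics=['accuracy'])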

I expected that either both losses would decrease while both accuracies increase, or that the network would overfit and the validation loss and accuracy wouldn't change much. Either way, shouldn't the loss and its corresponding accuracy be directly linked and move inversely to each other?

Also, I notice that my validation loss is always less than my training loss, which seems wrong to me.

Here's the loss (training: blue, validation: green):

[plot: training and validation loss]

Here's the accuracy (training: black, validation: yellow):

[plot: training and validation accuracy]

Loss and accuracy are indeed connected, but the relationship is not so simple.

Loss drops but accuracy is about the same

Let's say we have 6 samples; our y_true could be:

[0, 0, 0, 1, 1, 1]

Furthermore, let's assume our network predicts the following probabilities:

[0.9, 0.9, 0.9, 0.1, 0.1, 0.1]

This gives us a summed binary cross-entropy loss of ~13.82 and an accuracy of zero, since every sample is classified incorrectly.

Now, after a parameter update via backpropagation, let's say the new predictions would be:

[0.6, 0.6, 0.6, 0.4, 0.4, 0.4]

One can see those are better estimates of the true distribution (the summed loss for this example is ~5.50), while the accuracy didn't change and is still zero.
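
A minimal NumPy check of the numbers above, using a summed binary cross-entropy with natural logarithms and a 0.5 threshold for accuracy:

import numpy as np

def bce_sum(y_true, y_pred):
    # Summed binary cross-entropy over all samples (natural log).
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return -np.sum(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

def accuracy(y_true, y_pred):
    # Fraction of samples whose 0.5-thresholded prediction matches the label.
    return np.mean((np.asarray(y_pred) > 0.5).astype(int) == np.asarray(y_true))

y_true = [0, 0, 0, 1, 1, 1]
before = [0.9, 0.9, 0.9, 0.1, 0.1, 0.1]
after  = [0.6, 0.6, 0.6, 0.4, 0.4, 0.4]

print(bce_sum(y_true, before), accuracy(y_true, before))  # ~13.82, 0.0
print(bce_sum(y_true, after), accuracy(y_true, after))    # ~5.50, 0.0

The loss drops by more than half while the accuracy stays pinned at zero.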

All in all, the relationship is more complicated: the network could improve its fit on some examples while getting worse on others, which keeps the accuracy about the same.

Why is my network unable to fit the data?

Such a situation usually occurs when your data is really complicated (or incomplete) and/or your model is too weak. Here both are the case: financial data prediction involves a lot of hidden variables which your model cannot infer. Furthermore, dense layers are not the right tool for this task; each day depends on the previous values, which is a perfect fit for Recurrent Neural Networks. You can find an article about LSTMs and how to use them here (and tons of others over the web). A minimal recurrent baseline is sketched below.
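
A minimal sketch of such a recurrent baseline, assuming the same (90 days x 5 features) input windows as the model above; the layer size here is arbitrary:

from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential([
    LSTM(64, input_shape=(90, 5)),   # reads the 90-day window in order
    Dense(1, activation='sigmoid'),  # P(stock goes up on the next day)
])
model.compile(optimizer='sgd', loss='binary_crossentropy',
              metrics=['accuracy'])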
