

TF model accuracy changed after saving and loading

Original post 2019-11-15 02:01:19 6 1 python / tensorflow

I trained a TF model for classification, and the AUC on the validation data was 0.759 after 1 epoch.

I saved the checkpoint, then restored it and continued training on the same training data. The validation AUC suddenly jumped to 0.764 and converged to 0.766 before over-fitting.

The AUC on the test data set is also 0.766.

I wonder how this happened. Why did the TF model's accuracy change after saving and loading?

I also tried training for 2 epochs in a single run, without saving/loading. After the first epoch, the model began to overfit rapidly.

1 Solution

I wonder how this happened. Why did the TF model's accuracy change after saving and loading?

The accuracy increased because you trained the model for one more epoch. Saving and loading has nothing to do with it. This is to be expected. The reverse can also happen: your accuracy might decrease with further training because of overfitting. The number of epochs used to train the network is itself a hyperparameter that needs tuning, and early stopping can help with this. Also, you get a different set of initial weights every time you build a model from scratch. You can fix the random seed to get more reproducible results.
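To make the early-stopping idea concrete, here is a minimal, framework-free sketch of a patience rule applied to a list of per-epoch validation scores. The function name `early_stopping_epoch`, the `patience` and `min_delta` parameters, and the scores after the third epoch are assumptions made for illustration; only 0.759, 0.764 and 0.766 come from the question.

```python
def early_stopping_epoch(val_scores, patience=2, min_delta=0.0):
    """Return (best_epoch, best_score) under a simple patience rule.

    val_scores: validation metric per epoch, higher is better (e.g. AUC).
    Training is treated as stopped once `patience` consecutive epochs
    fail to beat the best score so far by more than `min_delta`.
    """
    best_epoch, best_score = -1, float("-inf")
    epochs_without_improvement = 0
    for epoch, score in enumerate(val_scores):
        if score > best_score + min_delta:
            # New best: remember it and reset the patience counter.
            best_epoch, best_score = epoch, score
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # stop: no improvement for `patience` epochs
    return best_epoch, best_score


# First three values are from the question; the rest are made up
# to illustrate the validation AUC dropping once overfitting starts.
val_auc = [0.759, 0.764, 0.766, 0.765, 0.763, 0.760]
best_epoch, best_auc = early_stopping_epoch(val_auc, patience=2)
print(best_epoch, best_auc)
```

In a real training loop you would save a checkpoint whenever a new best score is seen and restore that checkpoint after stopping. Keras users would normally reach for the built-in `tf.keras.callbacks.EarlyStopping` callback rather than hand-rolling this.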

Because training a neural network is non-deterministic (the same training data will not always give the same trained model, since random weight initialization can lead you to different minima), it is standard practice to reset and retrain your network several times (perhaps 10 times) and report the min, max and stddev of the performance metric. A low stddev suggests that you have a stable architecture: you can expect it to behave similarly every time you train it.
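The repeated-runs procedure can be sketched as follows. Note that `train_and_evaluate` is a hypothetical placeholder, not a real training routine: it only simulates a seed-dependent validation metric so that the example runs without TensorFlow. In a real script it would build, train and evaluate the model, and with TensorFlow 2.x you would typically also call `tf.random.set_seed(seed)` inside it.

```python
import random
import statistics


def train_and_evaluate(seed):
    """Hypothetical stand-in for one full training run.

    Returns a simulated validation metric that depends only on `seed`,
    mimicking how a fixed seed makes a run reproducible.
    """
    rng = random.Random(seed)          # seeded RNG: same seed, same result
    return 0.76 + rng.gauss(0, 0.003)  # made-up metric, not real data


def summarize_runs(seeds):
    """Train once per seed and summarize the spread of the metric."""
    scores = [train_and_evaluate(seed) for seed in seeds]
    return {
        "min": min(scores),
        "max": max(scores),
        "stddev": statistics.stdev(scores),  # sample standard deviation
    }


summary = summarize_runs(range(10))  # 10 independent runs, seeds 0..9
print(summary)
```

A small stddev relative to the differences you care about (here, changes of a few thousandths in AUC) indicates that a jump such as 0.759 to 0.764 is not just seed-to-seed noise.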