
Keras loss value very high and not decreasing

Firstly, I know that similar questions have been asked before, but mainly for classification problems. Mine is a regression-style problem.

I am trying to train a neural network using Keras to evaluate chess positions using Stockfish evaluations. The input is boards in a (12, 8, 8) array (representing piece placement for each individual piece), and the output is the evaluation in pawns. When training, the loss stagnates at around 500,000-600,000. I have a little over 12 million boards + evaluations, and I train on all the data at once. The loss function is MSE.

This is my current code:

import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential, load_model
from keras.layers import Dense, Dropout, Flatten

model = Sequential()
model.add(Dense(16, activation = "relu", input_shape = (12, 8, 8)))
model.add(Dropout(0.2))
model.add(Dense(16, activation = "relu"))
model.add(Dense(10, activation = "relu"))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(1, activation = "linear"))
model.compile(optimizer = "adam", loss = "mean_squared_error", metrics = ["mse"])
model.summary()
# model = load_model("model.h5")

boards = np.load("boards.npy")  # shape (N, 12, 8, 8)
evals = np.load("evals.npy")    # shape (N,), evaluations in pawns
perf = model.fit(boards, evals, epochs = 10).history
model.save("model.h5")
plt.figure(dpi = 600)
plt.title("Loss")
plt.plot(perf["loss"])
plt.show()

This is the output from a previous epoch:

145856/398997 [=========>....................] - ETA: 26:23 - loss: 593797.4375 - mse: 593797.4375

The loss remains at 570,000-580,000 upon further fitting, which is not ideal. If I am not wrong, the loss should decrease by a few more orders of magnitude.

What is the problem, and how can I fix it to make the model learn better?

I would suspect that your evaluation data contains very big values, like 100,000 pawns when one of the sides has a forced win. Then, if your model predicts something like 0 in the same position, the squared error is very high, and this pushes the MSE up as well. You might want to check your evaluation data and ensure the values are in some limited range like [-20..20].
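For example, a quick sanity check and clipping pass over the targets might look like this (a minimal sketch; the ±20 pawn bound is an assumed choice, and mate scores being encoded as huge sentinel values is an assumption about your data):

import numpy as np

evals = np.load("evals.npy")
print(evals.min(), evals.max())   # inspect the raw range first

# Clip extreme (e.g. forced-mate) scores into a limited range;
# the +-20 pawn bound here is an assumption, not a fixed rule.
evals = np.clip(evals, -20, 20)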

Furthermore, evaluating a chess position is a very complex problem. It looks like your model has too few parameters for the task. Possible improvements:

  • Increase the number of neurons in your dense layers (say to 300, 200, 100).
  • Increase the number of hidden layers (say to 10).
  • Use convolutional layers (see the sketch after this list).
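As an illustration, here is a minimal sketch of a convolutional variant (the layer sizes are illustrative, not tuned; it assumes you transpose the boards to the channels-last layout that Conv2D expects by default):

from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense

# boards are stored as (12, 8, 8); channels_last Conv2D expects
# (8, 8, 12), so transpose first: boards = boards.transpose(0, 2, 3, 1)
model = Sequential()
model.add(Conv2D(64, (3, 3), activation = "relu", padding = "same", input_shape = (8, 8, 12)))
model.add(Conv2D(64, (3, 3), activation = "relu", padding = "same"))
model.add(Flatten())
model.add(Dense(128, activation = "relu"))
model.add(Dense(1, activation = "linear"))
model.compile(optimizer = "adam", loss = "mean_squared_error", metrics = ["mse"])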

Besides this, you might want to create a simple "baseline model" to better evaluate the performance of your neural network. This baseline model could be just a Python function which runs on the input data and evaluates positions based on material counting (like bishop - 3 pawns, rook - 5, etc.). Then you can run this function on your dataset and see the MSE for it. If your neural network produces a smaller MSE than this baseline model, then it is really learning some useful patterns.
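A minimal sketch of such a baseline, assuming the 12 planes are ordered as white P, N, B, R, Q, K followed by black P, N, B, R, Q, K (the plane ordering and the subsample size are assumptions, not part of your setup):

import numpy as np

# Standard material values in pawns; the king gets 0.
PIECE_VALUES = np.array([1, 3, 3, 5, 9, 0], dtype = np.float32)

def material_eval(board):
    # board: a (12, 8, 8) array, white planes first (assumed ordering)
    white = (board[:6].sum(axis = (1, 2)) * PIECE_VALUES).sum()
    black = (board[6:].sum(axis = (1, 2)) * PIECE_VALUES).sum()
    return white - black    # positive means white is ahead

boards = np.load("boards.npy")
evals = np.load("evals.npy")
n = 100000                  # subsample for speed (illustrative choice)
baseline = np.array([material_eval(b) for b in boards[:n]])
print("baseline MSE:", np.mean((baseline - evals[:n]) ** 2))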

I also recommend the following book: "Neural Networks For Chess: The magic of deep and reinforcement learning revealed" by Dominik Klein. The book contains a description of the network architecture used in the AlphaZero chess engine and of the neural network used in Stockfish.
