Fluctuations of neural network predictions in sklearn

When I fit a sklearn neural network (MLPRegressor) to a very small dataset (50 elements) with 18 input features, I get a highly fluctuating output each time I run the neural network. The reason is that the parameters (weights and biases) are initialized anew each time, and the number of inputs apparently is not sufficient to fit the neural network. I have to stress that the MLP regressor's max iterations is set to 2000 and hence it should converge. It is probably a problem of overfitting, but the output fluctuates for different layer-node combinations. How can I tackle this problem?
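For illustration, here is a minimal sketch of the behaviour with random placeholder data standing in for the 50-sample, 18-feature set (the hidden-layer sizes, the train/test split, and the seed range are assumptions, not the exact setup): refitting the same MLPRegressor with different `random_state` values shows how much of the run-to-run spread comes from the weight initialization alone.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 18))    # placeholder for the 18 real features
y = rng.uniform(-1, 2, size=50)  # placeholder target in the interval [-1, 2]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Same data, same hyperparameters -- only the initialization seed changes.
for seed in range(5):
    mlp = MLPRegressor(hidden_layer_sizes=(3, 2), max_iter=2000,
                       random_state=seed)
    mlp.fit(X_train, y_train)
    rmse = np.sqrt(mean_squared_error(y_test, mlp.predict(X_test)))
    print(f"seed={seed}  test RMSE={rmse:.3f}")
```

Fixing `random_state` makes a single run reproducible; looping over several seeds like this shows the spread you would otherwise see between runs.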


Edit:

The error is large compared to a random forest (0.2 eV), a linear polynomial fit (0.18 eV), and Gaussian regression (0.17). The RMSE is plotted; the target variable is in the interval [-1, 2].

The neural network has a small size of 3*2. According to the design principle here, the phenomenon does not change for other geometries, however. A hyperparameter search is difficult if the results fluctuate that much.

As a non-expert in neural networks but with experience in statistical modelling, which seems like it would apply here: when you say it "should converge," why should it converge? And have you checked how many layers and how many weights/neurons per layer you're using? Or how many total independent parameters?
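For reference, a minimal sketch of that parameter count, assuming the 18 input features and the 3*2 hidden-layer geometry mentioned in the question (the dummy data is only there to build the weight matrices):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Dummy fit just to instantiate the weights for 18 inputs -> (3, 2) hidden -> 1 output.
mlp = MLPRegressor(hidden_layer_sizes=(3, 2), max_iter=10, random_state=0)
mlp.fit(np.random.rand(50, 18), np.random.rand(50))

n_weights = sum(w.size for w in mlp.coefs_)      # 18*3 + 3*2 + 2*1 = 62
n_biases = sum(b.size for b in mlp.intercepts_)  # 3 + 2 + 1 = 6
print(n_weights + n_biases)                      # 68 parameters for 50 data points
```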

Because based on how neural networks usually work, you're trying to find a maximum in a space with some huge number of dimensions, and many, many local maxima. On top of that, you don't have enough data for the model to fit well, possibly fewer data points than independent parameters. I can't imagine that there is any optimization method that would reliably converge to the same results over multiple runs in such a situation.

In a normal statistical model, the least you would do in this case is use a simpler model with fewer parameters. In your case, perhaps the question to ask is whether it's really feasible to fit any kind of MLP regressor to 50 data points.

It's hard to answer your question without knowing the form of the data you're trying to fit. But with a small dataset, you could try another type of model for whatever you're doing (a sketch of such a comparison follows below). Neural networks only show benefits over other types of models for large datasets.
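As one hedged illustration of that suggestion, a sketch comparing the MLP against a simpler ridge regression via cross-validation on placeholder data (the model choices, hyperparameters, and data are assumptions, not the asker's setup):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 18))    # placeholder features
y = rng.uniform(-1, 2, size=50)  # placeholder target

models = {
    "ridge": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
    "mlp": make_pipeline(StandardScaler(),
                         MLPRegressor(hidden_layer_sizes=(3, 2),
                                      max_iter=2000, random_state=0)),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5,
                             scoring="neg_root_mean_squared_error")
    print(f"{name}: RMSE = {-scores.mean():.3f} +/- {scores.std():.3f}")
```

Cross-validation also gives a more stable error estimate than a single train/test split, which matters on 50 samples.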
