
Neural Networks Extending Learning Domain

I have a simple function f : R -> R, f(x) = x² + a, and would like to create a neural network that learns this function as completely as it can. Currently, I have a PyTorch implementation that, of course, takes inputs from a limited range, x0 to xN, sampled at a particular number of points. Each epoch, the training data is randomly perturbed, so that the network does not just learn the relationship at the same grid points every time.
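For concreteness, here is a minimal sketch of the setup being described; the value of a, the range, the architecture, and the hyperparameters are all assumptions, not taken from the actual implementation:

```python
import torch
import torch.nn as nn

a = 1.0                      # assumed constant in f(x) = x^2 + a
x0, xN, n_points = -2.0, 2.0, 200

model = nn.Sequential(       # assumed architecture
    nn.Linear(1, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

base_grid = torch.linspace(x0, xN, n_points).unsqueeze(1)
spacing = (xN - x0) / (n_points - 1)

for epoch in range(2000):
    # Jitter the grid each epoch so the network does not see the
    # exact same sample points every time.
    x = base_grid + spacing * (torch.rand_like(base_grid) - 0.5)
    y = x ** 2 + a
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```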

Currently, it does a great job of learning the function on the range it is trained on, but is it at all feasible to train it in such a way that the learning extends beyond the training range? At the moment, the behavior outside the training range seems to depend on the activation function. For example, with ReLU, the true function (orange) compared to the network's prediction (blue) looks like this:

[image: true function (orange) vs. network prediction (blue)]
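Continuing the sketch above (it reuses `model` and `a` from that snippet), probing the trained model outside [x0, xN] makes this activation dependence visible; a feedforward ReLU network computes a piecewise-linear function, so far from the data it continues along a straight line:

```python
with torch.no_grad():
    x_test = torch.linspace(-6.0, 6.0, 400).unsqueeze(1)
    pred = model(x_test)        # ReLU net: roughly linear for |x| > 2
    true = x_test ** 2 + a      # keeps curving upward
```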

I understand that if I transform the input vector into a higher-dimensional one containing higher powers of x, this might work out pretty well, but for the generalized case, and for how I plan to use this in the future, it won't work as well on non-polynomial functions.
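One way to express that "higher powers" idea in code is to expand the scalar input into polynomial features before the first layer. This is only a sketch; the degree and layer sizes are arbitrary choices:

```python
import torch
import torch.nn as nn

def poly_features(x, degree=3):
    # (N, 1) -> (N, degree): columns x, x^2, ..., x^degree
    return torch.cat([x ** p for p in range(1, degree + 1)], dim=1)

poly_model = nn.Sequential(   # first layer width must match `degree`
    nn.Linear(3, 32), nn.ReLU(),
    nn.Linear(32, 1),
)
# Training then uses poly_model(poly_features(x)) in place of model(x).
```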

One thought that came to mind comes from support vector machines and the choice of kernel, specifically how the radial basis kernel gets around this generalization issue, but I'm not sure whether that can be applied here without the inner-product machinery of SVMs.
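For what it's worth, the RBF idea can be imitated without any SVM machinery by using a layer of fixed Gaussian bumps as features. A hedged sketch, with assumed centers and width:

```python
import torch
import torch.nn as nn

class RBFLayer(nn.Module):
    """Fixed Gaussian features phi_i(x) = exp(-gamma * (x - c_i)^2)."""
    def __init__(self, centers, gamma=1.0):
        super().__init__()
        self.register_buffer("centers", centers)  # shape (K,)
        self.gamma = gamma

    def forward(self, x):
        # x: (N, 1) -> (N, K) via broadcasting against the centers
        return torch.exp(-self.gamma * (x - self.centers) ** 2)

centers = torch.linspace(-2.0, 2.0, 20)   # assumed: centers span the data
rbf_model = nn.Sequential(RBFLayer(centers), nn.Linear(20, 1))
# Caveat: every Gaussian decays to zero away from its center, so far
# outside the training range this model outputs a near-constant value;
# it does not extrapolate x^2 + a either.
```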

What you want is called extrapolation (as opposed to interpolation, which is predicting a value inside the trained domain/range). There is never a good general solution for extrapolation. Using higher powers can give you a better fit for a specific problem, but if you change the fitted curve slightly (its x- or y-intercept, one of the powers, etc.), the extrapolation will be pretty bad again.

This is also why neural networks use large data sets (to maximize their input range and rely on interpolation), and why over-training / over-fitting (which is what you're trying to do) is a bad idea; it never works well in the general case.
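A quick way to see this point is to over-fit a flexible model inside a range and evaluate it outside; here is a small NumPy illustration, where the degree-9 polynomial fit stands in for any over-flexible model:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-2, 2, 50)
y = x ** 2 + 1.0 + rng.normal(0.0, 0.05, x.shape)  # slightly noisy data

coeffs = np.polyfit(x, y, deg=9)    # deliberately over-flexible fit
x_out = np.linspace(-6, 6, 200)
pred = np.polyval(coeffs, x_out)    # accurate inside [-2, 2], diverges outside
```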
