
Dynamic range of unknown parameters and form of cost function in scipy.optimize.least_squares

Original post: 2018-05-15 20:09:32 | Tags: python, scipy, mathematical-optimization, least-squares, nonlinear-optimization

I am using scipy.optimize.least_squares to solve an interval-constrained nonlinear least-squares problem. Specifically, I want to find a0, a1, b0, and b1 such that the cost function:

$$\sum_{n=1}^{N} \left( g_n - \frac{y_n - b_0 e^{-t_n/b_1}}{a_0 e^{-t_n/a_1}} \right)^2$$

is minimized, where g_n, y_n, and t_n are known and a0, a1, b0, and b1 are each subject to interval constraints.
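For concreteness, here is a minimal sketch of how this problem might be set up. The data, the bounds, and the starting point are all made up for illustration (they are not my real values); only the residual expression follows the cost function above.

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(p, t, y, g):
    """Residuals g_n - (y_n - b0*exp(-t_n/b1)) / (a0*exp(-t_n/a1))."""
    a0, a1, b0, b1 = p
    model = (y - b0 * np.exp(-t / b1)) / (a0 * np.exp(-t / a1))
    return g - model

# Synthetic data, invented purely for this example.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 20.0, 50)
true_p = np.array([2e-3, 30.0, 5e-3, 10.0])   # a0, a1, b0, b1
g_true = 1.0 + 0.5 * np.sin(0.3 * t)
a0, a1, b0, b1 = true_p
y = g_true * a0 * np.exp(-t / a1) + b0 * np.exp(-t / b1)
g = g_true + 1e-3 * rng.standard_normal(t.size)

# Hypothetical interval constraints and starting point.
lower = np.array([1e-4, 1.0, 1e-4, 1.0])
upper = np.array([1e-1, 100.0, 1e-1, 100.0])
x0 = np.array([1e-3, 20.0, 1e-3, 20.0])

result = least_squares(residuals, x0, bounds=(lower, upper), args=(t, y, g))
print(result.x, result.cost)
```

least_squares squares and sums the residual vector itself (its reported cost is half that sum), so the function returns the residuals rather than the scalar cost.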

The four unknown parameters span approximately four orders of magnitude (e.g., a0 = 2e-3, a1 = 30, and similarly for b0 and b1). I have heard that a high dynamic range in the unknown parameters can be numerically problematic for optimization routines.

My first question is whether a range of four or so orders of magnitude would be problematic for scipy.optimize.least_squares. The routine appears to converge on the data I've applied it to so far.

My second question relates to the form of the cost function. I can equivalently write it as:

$$\sum_{n=1}^{N} \left( g_n - \left( \frac{1}{a_0} e^{t_n/a_1} y_n - \frac{b_0}{a_0} e^{-t_n/b_1 + t_n/a_1} \right) \right)^2$$

=

$$\sum_{n=1}^{N} \left( g_n - \left( a_0' e^{t_n/a_1} y_n - b_0' e^{-t_n b_1'} \right) \right)^2$$

where the new parameters are simple transformations of the original parameters. Is there any advantage to doing this in terms of numerical stability or the avoidance of local minima? I haven't proven it, but I wonder whether this new cost function would be convex, unlike the original cost function.
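The transformations are not spelled out above; matching the two expressions term by term suggests the following mapping (an inference from the formulas, with a1 left unchanged):

$$a_0' = \frac{1}{a_0}, \qquad b_0' = \frac{b_0}{a_0}, \qquad b_1' = \frac{1}{b_1} - \frac{1}{a_1}$$

with inverse

$$a_0 = \frac{1}{a_0'}, \qquad b_0 = \frac{b_0'}{a_0'}, \qquad b_1 = \frac{1}{b_1' + 1/a_1}$$

The exponent checks out because $-t_n/b_1 + t_n/a_1 = -t_n \left( 1/b_1 - 1/a_1 \right) = -t_n b_1'$.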

1 Answer

Most solvers are designed for variables in the 1-10 range. A large range can cause numerical problems, but it is not guaranteed to be problematic. Numerical problems sometimes stem from the matrix factorization used in the linear algebra that solves for the Newton step, which depends more on the magnitude of the derivatives. You may also encounter challenges with termination tolerances for values outside the 1-10 range. Overall, if it looks like it's working, it's probably fine. You could get a slightly better answer by normalizing the values.
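One way to try normalization without rewriting the model is the x_scale argument of scipy.optimize.least_squares, which accepts either an array of characteristic parameter scales or the string "jac". The sketch below is illustrative only: the data, bounds, starting point, and the `typical` magnitudes are assumptions, not values from the question.

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(p, t, y, g):
    """Residuals of the original formulation."""
    a0, a1, b0, b1 = p
    return g - (y - b0 * np.exp(-t / b1)) / (a0 * np.exp(-t / a1))

# Noise-free synthetic data, invented for this example.
t = np.linspace(0.0, 20.0, 50)
g = 1.0 + 0.5 * np.sin(0.3 * t)
y = g * 2e-3 * np.exp(-t / 30.0) + 5e-3 * np.exp(-t / 10.0)

x0 = np.array([1e-3, 20.0, 1e-3, 20.0])
bounds = ([1e-4, 1.0, 1e-4, 1.0], [1e-1, 100.0, 1e-1, 100.0])

# Assumed characteristic magnitude of each parameter (a0, a1, b0, b1).
typical = np.array([1e-3, 10.0, 1e-3, 10.0])
fit_scaled = least_squares(residuals, x0, bounds=bounds,
                           x_scale=typical, args=(t, y, g))

# Alternatively, let the solver derive scales from the Jacobian columns.
fit_jac = least_squares(residuals, x0, bounds=bounds,
                        x_scale="jac", args=(t, y, g))

print(fit_scaled.x, fit_scaled.cost)
print(fit_jac.x, fit_jac.cost)
```

Comparing the two fits against an unscaled run on your own data is a cheap way to see whether the dynamic range matters in practice.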

Division by a degree of freedom (that is, by one of the variables being optimized) can cause difficulties in three ways:

division by zero
discontinuous derivatives around 0
very steep derivatives near 0, or very flat derivatives far from 0

For these reasons, I would recommend the reformulated cost function:

$$\sum_{n=1}^{N} \left( g_n - \left( a_0' e^{t_n/a_1} y_n - b_0' e^{-t_n b_1'} \right) \right)^2$$

However, as previously stated, if it's already working, it may not be worth the effort to reformulate your problem.
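If you do want to try the reformulation, a rough sketch might look like the following. The parameter mapping (a0' = 1/a0, b0' = b0/a0, b1' = 1/b1 - 1/a1) is inferred from the formulas in the question, and the data and bounds are invented. Note that interval constraints on the original parameters do not translate into simple box bounds on b0' and b1', so the box used here is only a loose enclosure and is an assumption of this example.

```python
import numpy as np
from scipy.optimize import least_squares

def residuals_reparam(q, t, y, g):
    """Residuals g_n - (a0' e^{t/a1} y_n - b0' e^{-t b1'}); no division by a variable."""
    a0p, a1, b0p, b1p = q
    return g - (a0p * np.exp(t / a1) * y - b0p * np.exp(-t * b1p))

def to_reparam(p):
    """Map (a0, a1, b0, b1) to (a0', a1, b0', b1')."""
    a0, a1, b0, b1 = p
    return np.array([1.0 / a0, a1, b0 / a0, 1.0 / b1 - 1.0 / a1])

def to_original(q):
    """Map (a0', a1, b0', b1') back to (a0, a1, b0, b1)."""
    a0p, a1, b0p, b1p = q
    return np.array([1.0 / a0p, a1, b0p / a0p, 1.0 / (b1p + 1.0 / a1)])

# Noise-free synthetic data, invented for this example.
t = np.linspace(0.0, 20.0, 50)
true_p = np.array([2e-3, 30.0, 5e-3, 10.0])
g = 1.0 + 0.5 * np.sin(0.3 * t)
y = g * true_p[0] * np.exp(-t / true_p[1]) + true_p[2] * np.exp(-t / true_p[3])

# Loose enclosing box for (a0', a1, b0', b1'); an assumption, not an exact
# image of the original interval constraints.
lower = np.array([10.0, 1.0, 1e-3, -0.99])
upper = np.array([1e4, 100.0, 1e3, 0.99])
q0 = to_reparam(np.array([1e-3, 20.0, 1e-3, 20.0]))

fit = least_squares(residuals_reparam, q0, bounds=(lower, upper), args=(t, y, g))
print(fit.x, fit.cost)
# Converting back is only meaningful when b1' + 1/a1 is nonzero.
print(to_original(fit.x))
```

If the original bounds must hold exactly, check the back-transformed values against them after the fit, or use a solver that supports general constraints.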