[英]Python's scipy.optimize.minimize with SLSQP fails with “Positive directional derivative for linesearch”
I have a least squares minimization problem subject to inequality constraints which I am trying to solve using scipy.optimize.minimize. 我有一个最小二乘最小化问题,受不等式约束的影响,我正在尝试使用scipy.optimize.minimize解决。 It seems that there are two options for inequality constraints: COBYLA and SLSQP. 对于不平等约束,似乎有两种选择:COBYLA和SLSQP。
I first tried SLSQP since it allow for explicit partial derivatives of the function to be minimized. 我首先尝试了SLSQP,因为它允许将函数的显式偏导数最小化。 Depending on the scaling of the problem, it fails with error: 根据问题的严重程度,它会因错误而失败:
Positive directional derivative for linesearch (Exit mode 8)
whenever interval or more general inequality constraints are imposed. 只要施加间隔或更普遍的不平等约束。
This has been observed previously eg, here . 以前例如在这里已经观察到了这一点 。 Manual scaling of the function to be minimized (along with the associated partial derivatives) seems to get rid of the problem, but I cannot achieve the same effect by changing ftol in the options. 手动缩放要最小化的功能(以及相关的偏导数)似乎可以解决此问题,但是我无法通过更改选项中的ftol来实现相同的效果。
Overall, this whole thing is causing me to have doubts about the routine working in a robust manner. 总的来说,这件事使我对常规程序的健壮性产生怀疑。 Here's a simplified example: 这是一个简化的示例:
import numpy as np
import scipy.optimize as sp_optimize
def cost(x, A, y):
e = y - A.dot(x)
rss = np.sum(e ** 2)
return rss
def cost_deriv(x, A, y):
e = y - A.dot(x)
deriv0 = -2 * e.dot(A[:,0])
deriv1 = -2 * e.dot(A[:,1])
deriv = np.array([deriv0, deriv1])
return deriv
A = np.ones((10,2)); A[:,0] = np.linspace(-5,5, 10)
x_true = np.array([2, 2/20])
y = A.dot(x_true)
x_guess = x_true / 2
prm_bounds = ((0, 3), (0,1))
cons_SLSQP = ({'type': 'ineq', 'fun' : lambda x: np.array([x[0] - x[1]]),
'jac' : lambda x: np.array([1.0, -1.0])})
# works correctly
min_res_SLSQP = sp_optimize.minimize(cost, x_guess, args=(A, y), jac=cost_deriv, bounds=prm_bounds, method='SLSQP', constraints=cons_SLSQP, options={'disp': True})
print(min_res_SLSQP)
# fails
A = 100 * A
y = A.dot(x_true)
min_res_SLSQP = sp_optimize.minimize(cost, x_guess, args=(A, y), jac=cost_deriv, bounds=prm_bounds, method='SLSQP', constraints=cons_SLSQP, options={'disp': True})
print(min_res_SLSQP)
# works if bounds and inequality constraints removed
min_res_SLSQP = sp_optimize.minimize(cost, x_guess, args=(A, y), jac=cost_deriv,
method='SLSQP', options={'disp': True})
print(min_res_SLSQP)
How should ftol be set to avoid failure? 应该如何设置以避免故障? More generally, can a similar problem arise with COBYLA? 更一般而言,COBYLA是否会出现类似的问题? Is COBYLA a better choice for this type of inequality constrained least squares optimization problem? 对于这种类型的不等式约束最小二乘优化问题,COBYLA是更好的选择吗?
Using a square root in the cost function was found to improve performance. 发现在成本函数中使用平方根可以提高性能。 However, for a non-linear re-paramterization of the problem (simpler but closer to what I need to do in practice), it fails again. 但是,对于问题的非线性重新参数化(更简单,但更接近我在实践中需要做的事情),它再次失败。 Here are the details: 详细信息如下:
import numpy as np
import scipy.optimize as sp_optimize
def cost(x, y, g):
e = ((y - x[1]) / x[0]) - g
rss = np.sqrt(np.sum(e ** 2))
return rss
def cost_deriv(x, y, g):
e = ((y- x[1]) / x[0]) - g
factor = 0.5 / np.sqrt(e.dot(e))
deriv0 = -2 * factor * e.dot(y - x[1]) / (x[0]**2)
deriv1 = -2 * factor * np.sum(e) / x[0]
deriv = np.array([deriv0, deriv1])
return deriv
x_true = np.array([1/300, .1])
N = 20
t = 20 * np.arange(N)
g = 100 * np.cos(2 * np.pi * 1e-3 * (t - t[-1] / 2))
y = g * x_true[0] + x_true[1]
x_guess = x_true / 2
prm_bounds = ((1e-4, 1e-2), (0, .4))
# check derivatives
delta = 1e-9
C0 = cost(x_guess, y, g)
C1 = cost(x_guess + np.array([delta, 0]), y, g)
approx_deriv0 = (C1 - C0) / delta
C1 = cost(x_guess + np.array([0, delta]), y, g)
approx_deriv1 = (C1 - C0) / delta
approx_deriv = np.array([approx_deriv0, approx_deriv1])
deriv = cost_deriv(x_guess, y, g)
# fails
min_res_SLSQP = sp_optimize.minimize(cost, x_guess, args=(y, g), jac=cost_deriv,
bounds=prm_bounds, method='SLSQP', options={'disp': True})
print(min_res_SLSQP)
Instead of minimizing np.sum(e ** 2)
, minimize sqrt(np.sum(e ** 2))
, or better (in terms of calculation): np.linalg.norm(e)
! 代替最小化np.sum(e ** 2)
,最小化sqrt(np.sum(e ** 2))
或更好(就计算而言): np.linalg.norm(e)
!
This modification: 此修改:
x
不会改变关于x
解决方案 With this change, all cases work, even using numerical-differentiation (i was too lazy to modify the gradient, which needs to reflect this!). 有了这一更改,所有情况都可以使用 , 即使使用数值微分也是如此(我太懒了,无法修改梯度,这需要反映这一点!)。
Example output (number of func-evals gives away num-diff): 输出示例(func-evals的数量给出num-diff):
Optimization terminated successfully. (Exit mode 0)
Current function value: 3.815547437029837e-06
Iterations: 16
Function evaluations: 88
Gradient evaluations: 16
fun: 3.815547437029837e-06
jac: array([-6.09663382, -2.48862544])
message: 'Optimization terminated successfully.'
nfev: 88
nit: 16
njev: 16
status: 0
success: True
x: array([ 2.00000037, 0.10000018])
Optimization terminated successfully. (Exit mode 0)
Current function value: 0.0002354577991007501
Iterations: 23
Function evaluations: 114
Gradient evaluations: 23
fun: 0.0002354577991007501
jac: array([ 435.97259208, 288.7483819 ])
message: 'Optimization terminated successfully.'
nfev: 114
nit: 23
njev: 23
status: 0
success: True
x: array([ 1.99999977, 0.10000014])
Optimization terminated successfully. (Exit mode 0)
Current function value: 0.0003392807206384532
Iterations: 21
Function evaluations: 112
Gradient evaluations: 21
fun: 0.0003392807206384532
jac: array([ 996.57340243, 51.19298764])
message: 'Optimization terminated successfully.'
nfev: 112
nit: 21
njev: 21
status: 0
success: True
x: array([ 2.00000008, 0.10000104])
While there are probably some problems with SLSQP, it's still one of the most tested and robust codes given that broad application-spectrum! 尽管SLSQP可能存在一些问题,但鉴于广泛的应用范围,它仍然是最受测试和最强大的代码之一!
I would also expect SLSQP to be much better here compared to COBYLA, as the latter is based heavily on linearizations. 我还希望SLSQP在这里比COBYLA更好,因为COBYLA很大程度上基于线性化。 (but just take it as a guess; it's easy to try given the minimize-interface!) (但仅作为猜测;考虑最小接口,很容易尝试!)
Alternative 替代
In general, an Interior-point based solver for Convex Quadratic Programming will be the best approach here. 通常,用于凸二次规划的基于内部点的求解器将是此处的最佳方法。 But for this, you need to leave scipy. 但是为此,您需要保持科学。 (or maybe an SOCP-solver would be better... i'm not sure). (或者SOCP求解器可能会更好...我不确定)。
cvxpy brings a nice-modelling system and a good open-source solver ( ECOS ; although technically a conic-solver -> more general and less robust; but should beat SLSQP). cvxpy带来了一个很好的建模系统和一个很好的开源求解器( ECOS ;尽管从技术上讲是一个圆锥形求解器->更通用,更不可靠;但是应该胜过SLSQP)。
Using cvxpy and ECOS, this looks like: 使用cvxpy和ECOS,这看起来像:
import numpy as np
import cvxpy as cvx
""" Problem data """
A = np.ones((10,2)); A[:,0] = np.linspace(-5,5, 10)
x_true = np.array([2, 2/20])
y = A.dot(x_true)
x_guess = x_true / 2
prm_bounds = ((0, 3), (0,1))
# problematic case
A = 100 * A
y = A.dot(x_true)
""" Solve """
x = cvx.Variable(len(x_true))
constraints = [x[0] >= x[1]]
for ind, (lb, ub) in enumerate(prm_bounds): # ineffecient -> matrix-based expr better!
constraints.append(x[ind] >= lb)
constraints.append(x[ind] <= ub)
objective = cvx.Minimize(cvx.norm(A*x - y))
problem = cvx.Problem(objective, constraints)
problem.solve(solver=cvx.ECOS, verbose=False)
print(problem.status)
print(problem.value)
print(x.value.T)
# optimal
# -6.67593652593801e-10
# [[ 2. 0.1]]
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.