scipy.optimize.minimize('SLSQP') too slow when given 2000 dim variable
I have a non-linear optimization problem with a constraint and upper/lower bounds, so with scipy I have to use SLSQP. The problem is clearly not convex. I got the Jacobians of both the objective and constraint functions to work correctly (results are good and fast for input vectors of up to 300 elements). All functions are vectorized and tuned to run very fast. The problem is that an input vector of 1000+ elements takes ages, even though I can see the minimizer is not calling my functions (objective/constraint/gradients) very often; it seems to spend most of its processing time internally. I read somewhere that the performance of SLSQP is O(n^3).
Is there a better/faster SLSQP implementation, or another method for this type of problem, in Python? I tried nlopt, and it somehow returns wrong results given the exact same functions I use in scipy (with a wrapper to adapt to its method signature). I also failed to use ipopt with the pyipopt package; I cannot get working ipopt binaries to work with the Python wrapper.
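One option that stays within scipy is the 'trust-constr' method, which also accepts bounds and nonlinear constraints and which the SciPy documentation describes as the most appropriate of its constrained methods for large-scale problems. Whether it is actually faster on the problem described here is untested. The following is a minimal sketch on a hypothetical toy problem: the objective, the constraint, `target`, and the limit 1.5 are all made up for illustration and are not the functions from the question.

```python
import numpy as np
from scipy.optimize import minimize, Bounds, NonlinearConstraint, BFGS

n = 3
target = np.full(n, 2.0)  # hypothetical target, for illustration only

def objective(x):
    # Toy objective: squared distance to the target
    return np.sum((x - target) ** 2)

def objective_grad(x):
    # Analytic gradient of the toy objective
    return 2.0 * (x - target)

def constraint(x):
    # Toy nonlinear-constraint interface: here simply sum(x)
    return np.sum(x)

def constraint_jac(x):
    # Jacobian of the constraint, shape (1, n)
    return np.ones((1, x.size))

# Box bounds 0 <= x_i <= 1 and the inequality sum(x) <= 1.5
bounds = Bounds(np.zeros(n), np.ones(n))
limit = NonlinearConstraint(constraint, -np.inf, 1.5, jac=constraint_jac)

x0 = np.full(n, 0.25)  # feasible starting point
res = minimize(objective, x0, jac=objective_grad, hess=BFGS(),
               method='trust-constr', bounds=bounds, constraints=[limit])
print(res.x)
```

For this toy problem the constrained minimum is at x = (0.5, 0.5, 0.5), since every component is pulled toward 2.0 equally and the sum is capped at 1.5. Passing `hess=BFGS()` asks for a quasi-Newton Hessian approximation instead of an exact Hessian.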
UPDATE: if it helps, my input variable is basically a vector of (x, y) tuples, i.e. points on a 2D surface representing coordinates. With 1000 points, I end up with a 2000-dimensional input vector. The function I want to optimize calculates the optimum position of the points relative to each other, taking into consideration their relationships and other constraints. So the problem is not sparse.
Thanks...
In my opinion scipy.optimize.minimize provides an intuitive interface for optimization. I've found that speeding up the cost function (and eventually the gradient function) can give you nice speedups.
For example, take the N-dimensional Rosenbrock function:
import numpy as np
from scipy.optimize import minimize

def rosenbrock(x, N):
    # N-dimensional Rosenbrock function as a plain Python loop
    out = 0.0
    for i in range(N-1):
        out += 100.0 * (x[i+1] - x[i]**2)**2 + (1 - x[i])**2
    return out

# slow optimize
N = 20
x_0 = - np.ones(N)
# %timeit is an IPython/Jupyter magic, so run this in such a session
%timeit minimize(rosenbrock, x_0, args=(N,), method='SLSQP', options={'maxiter': 1e4})
res = minimize(rosenbrock, x_0, args=(N,), method='SLSQP', options={'maxiter': 1e4})
print(res.message)
Timing the optimization yields:
102 ms ± 1.86 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Optimization terminated successfully.
Now you could speed up the objective function using numba, and provide a naive function to compute the gradient, like this:
from numba import jit, float64, int64

@jit(float64(float64[:], int64), nopython=True, parallel=True)
def fast_rosenbrock(x, N):
    # Same loop as rosenbrock(), compiled to machine code by numba
    out = 0.0
    for i in range(N-1):
        out += 100.0 * (x[i+1] - x[i]**2)**2 + (1 - x[i])**2
    return out

@jit(float64[:](float64[:], int64), nopython=True, parallel=True)
def fast_jac(x, N):
    # Naive forward-difference gradient: one extra function call per component
    h = 1e-9
    jac = np.zeros_like(x)
    f_0 = fast_rosenbrock(x, N)
    for i in range(N):
        x_d = np.copy(x)
        x_d[i] += h
        f_d = fast_rosenbrock(x_d, N)
        jac[i] = (f_d - f_0) / h
    return jac
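This is basically just adding a decorator to the objective function, which compiles it and allows parallel computation. (Note that parallel=True only parallelizes loops written with numba.prange or supported array expressions, so with the plain range loops used here the gain comes from JIT compilation.) Now we can time the optimization again: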
print('with fast jacobian')
%timeit minimize(fast_rosenbrock, x_0, args=(N,), method='SLSQP', options={'maxiter': 1e4}, jac=fast_jac)
print('without fast jacobian')
%timeit minimize(fast_rosenbrock, x_0, args=(N,), method='SLSQP', options={'maxiter': 1e4})
res = minimize(fast_rosenbrock, x_0, args=(N,), method='SLSQP', options={'maxiter': 1e4}, jac=fast_jac)
print(res.message)
This tries both variants, with and without the fast Jacobian function. The output is:
with fast jacobian
9.67 ms ± 488 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
without fast jacobian
27.2 ms ± 2.4 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Optimization terminated successfully.
That is roughly a 10x speedup (102 ms down to 9.67 ms) for little effort. The improvement you can achieve this way depends heavily on how inefficient your cost function is to begin with. I had a cost function with several calculations in it and was able to get speedups of around 10^2 - 10^3.
The advantage of this approach is that it takes little effort and that you can stay with scipy and its nice interface.
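As a variation on the finite-difference gradient used above, an analytic gradient avoids the N extra function evaluations per gradient call altogether. For the Rosenbrock function specifically, scipy already ships vectorized versions of the function and its derivative as `scipy.optimize.rosen` and `scipy.optimize.rosen_der`. The sketch below uses those in place of the numba functions (so it is a swapped-in technique, not the approach timed above) and checks the gradient with `scipy.optimize.check_grad`. No timings are claimed for it.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der, check_grad

N = 20
x_0 = -np.ones(N)

# Norm of the difference between the analytic gradient and a
# finite-difference approximation at x_0; it should be small
# compared to the gradient itself.
grad_error = check_grad(rosen, rosen_der, x_0)
print('gradient check error:', grad_error)

# Same SLSQP call as before, but with the analytic gradient
res = minimize(rosen, x_0, jac=rosen_der, method='SLSQP',
               options={'maxiter': 10000})
print(res.message)
print('objective at start:', rosen(x_0), 'at result:', res.fun)
```

For your own cost function there is no ready-made derivative, so the equivalent step would be to derive the gradient by hand (or with an automatic differentiation tool) and validate it with `check_grad` before handing it to `minimize`.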
We don't know much about the model, but here are some notes:
Surprisingly, I found a relatively OK solution using an optimizer from a deep learning framework, Tensorflow, with basic gradient descent (actually RMSProp, a gradient descent variant with adaptive per-parameter step sizes), after I changed the cost function to include the inequality constraint and the bounding constraints as penalties (I suppose this is the same as the Lagrange method). It trains super fast and converges quickly with proper lambda parameters on the constraint penalties. I didn't even have to rewrite the Jacobians, as TF takes care of that, apparently without much speed impact.
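The penalty idea described above can be sketched in plain NumPy, without Tensorflow. Everything in this example is an assumption made for illustration: the toy objective (squared distance to `target`), the bounds 0 <= x <= 1, the inequality sum(x) <= `budget`, the penalty weight `lam`, and the hand-written RMSProp loop with its hyperparameters. It is not the cost function or the optimizer configuration from the post, and the gradient is written by hand here instead of being derived automatically.

```python
import numpy as np

target = np.array([2.0, -1.0, 0.5])  # hypothetical data
budget = 1.2                         # hypothetical limit on sum(x)
lam = 100.0                          # penalty weight (the "lambda")

def penalized_cost_and_grad(x):
    # Unconstrained part: squared distance to the target
    diff = x - target
    cost = np.sum(diff ** 2)
    grad = 2.0 * diff
    # Quadratic penalties for violating the bounds 0 <= x <= 1
    below = np.minimum(x, 0.0)        # nonzero only where x < 0
    above = np.maximum(x - 1.0, 0.0)  # nonzero only where x > 1
    cost += lam * (np.sum(below ** 2) + np.sum(above ** 2))
    grad += 2.0 * lam * (below + above)
    # Quadratic penalty for violating sum(x) <= budget
    excess = max(np.sum(x) - budget, 0.0)
    cost += lam * excess ** 2
    grad += 2.0 * lam * excess
    return cost, grad

def rmsprop(fun, x0, lr=0.002, decay=0.9, eps=1e-8, steps=10000):
    # Basic RMSProp: scale each step by a running RMS of the gradient
    x = x0.copy()
    avg_sq = np.zeros_like(x)
    for _ in range(steps):
        _, g = fun(x)
        avg_sq = decay * avg_sq + (1.0 - decay) * g ** 2
        x -= lr * g / (np.sqrt(avg_sq) + eps)
    return x

x_start = np.full(3, 0.5)
x_opt = rmsprop(penalized_cost_and_grad, x_start)
print(x_opt)
```

For this toy problem the exactly constrained optimum is (1, 0, 0.2). A penalty method with a finite `lam` lands slightly outside the feasible region, and a fixed-step RMSProp keeps jittering around the minimum, so the result is only approximately feasible. Larger `lam` tightens the constraints but makes the problem stiffer, which is the tuning trade-off mentioned above.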
Before that, I managed to get NLOPT to work, and it is much faster than scipy/SLSQP but still slow in higher dimensions. Also, NLOPT/AUGLAG is super fast but converges poorly.
That said, at 20k variables it is still slow, partly due to memory swapping and partly because the cost function is at least O(n^2) from the pairwise Euclidean distance (I use (x - x.T)^2 + (y - y.T)^2 with broadcasting). So it is still not optimal.
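To illustrate where the O(n^2) cost and the memory pressure come from, here is a small sketch of the broadcasting approach, plus one possible way to cap memory by processing rows in chunks. The function names, the chunk size, and the use of a plain sum as the quantity of interest are assumptions for illustration; the real cost function will combine the distances differently. The sketch uses 1-D arrays with `None` indexing in place of column vectors and `.T`.

```python
import numpy as np

def pairwise_sq_dist(x, y):
    # Full n-by-n matrix of squared distances via broadcasting.
    # Memory grows as n^2: at 10,000 points this is 800 MB of float64
    # for the result alone, plus temporaries.
    dx = x[:, None] - x[None, :]
    dy = y[:, None] - y[None, :]
    return dx ** 2 + dy ** 2

def sum_pairwise_sq_dist_chunked(x, y, chunk=256):
    # Same total, but only a (chunk, n) block exists at any time,
    # so peak memory is O(chunk * n) while the work stays O(n^2).
    total = 0.0
    for start in range(0, x.size, chunk):
        xs = x[start:start + chunk, None]
        ys = y[start:start + chunk, None]
        total += np.sum((xs - x[None, :]) ** 2 + (ys - y[None, :]) ** 2)
    return total

rng = np.random.default_rng(0)
x = rng.random(1000)
y = rng.random(1000)

dist = pairwise_sq_dist(x, y)
full_total = dist.sum()
chunked_total = sum_pairwise_sq_dist_chunked(x, y, chunk=128)
print(full_total, chunked_total)
```

Chunking addresses the swapping but not the quadratic amount of arithmetic, so it can only help with the memory half of the slowdown.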