
Python: Gridsearch Without Machine Learning?

I want to optimize an algorithm that has several variable parameters as input.

For machine learning tasks, Sklearn offers hyperparameter optimization through its gridsearch functionality.

Is there a standardized way / library in Python that allows the optimization of hyperparameters that is not limited to machine learning topics?

You can create a custom pipeline/estimator (see http://scikit-learn.org/dev/developers/contributing.html#rolling-your-own-estimator) with a score method to compare the results.

The ParameterGrid might help you too. It will automatically enumerate all the hyper-parameter settings (see the sketch below).
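To make that concrete, here is a minimal sketch (my own, not from the linked docs) of a custom estimator with a score method plugged into GridSearchCV. The objective function and the parameter names a and b are hypothetical; fit does nothing because there is nothing to learn, and the dummy data exists only so the cross-validation machinery has something to split.

import numpy as np
from sklearn.base import BaseEstimator
from sklearn.model_selection import GridSearchCV

def objective(a, b):
    # hypothetical function we want to minimize
    return (a - 1.5) ** 2 + (b + 0.5) ** 2

class ObjectiveEstimator(BaseEstimator):
    def __init__(self, a=0.0, b=0.0):
        self.a = a
        self.b = b

    def fit(self, X, y=None):
        # nothing to learn; GridSearchCV only requires that fit exists
        return self

    def score(self, X, y=None):
        # GridSearchCV maximizes the score, so negate the objective
        return -objective(self.a, self.b)

X_dummy = np.zeros((4, 1))  # dummy data for the CV splitter
grid = GridSearchCV(
    ObjectiveEstimator(),
    param_grid={'a': [0, 1, 1.5, 2], 'b': [-1, -0.5, 0, 0.5]},
    cv=2,
)
grid.fit(X_dummy)
print(grid.best_params_)  # expected: {'a': 1.5, 'b': -0.5}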

You might consider scipy's optimize.brute, which is essentially the same, although less constrained in terms of API usage. You just have to define a function that returns a scalar.

Minimize a function over a given range by brute force.

Uses the “brute force” method, i.e. computes the function's value at each point of a multidimensional grid of points, to find the global minimum of the function.

Shameless example copied from the docs:

Code

import numpy as np
from scipy import optimize


params = (2, 3, 7, 8, 9, 10, 44, -1, 2, 26, 1, -2, 0.5)

def f1(z, *params):
    # quadratic "bowl" term
    x, y = z
    a, b, c, d, e, f, g, h, i, j, k, l, scale = params
    return (a * x**2 + b * x * y + c * y**2 + d * x + e * y + f)

def f2(z, *params):
    # first Gaussian "dip"
    x, y = z
    a, b, c, d, e, f, g, h, i, j, k, l, scale = params
    return (-g * np.exp(-((x - h)**2 + (y - i)**2) / scale))

def f3(z, *params):
    # second Gaussian "dip"
    x, y = z
    a, b, c, d, e, f, g, h, i, j, k, l, scale = params
    return (-j * np.exp(-((x - k)**2 + (y - l)**2) / scale))

def f(z, *params):
    # objective to minimize: the bowl plus the two dips
    return f1(z, *params) + f2(z, *params) + f3(z, *params)

# one slice(start, stop, step) per parameter dimension of the search grid
rranges = (slice(-4, 4, 0.25), slice(-4, 4, 0.25))
resbrute = optimize.brute(f, rranges, args=params, full_output=True,
                          finish=optimize.fmin)
print(resbrute[:2])  # x0 (location of the minimum), fval (value there)

Out

(array([-1.05665192,  1.80834843]), -3.4085818767996527)

Brute-force functions are not much black magic, and one might often consider writing one's own implementation. The scipy example above has one more interesting feature:

finish : callable, optional

An optimization function that is called with the result of brute force minimization as initial guess. finish should take func and the initial guess as positional arguments, and take args as keyword arguments. It may additionally take full_output and/or disp as keyword arguments. Use None if no “polishing” function is to be used. See Notes for more details.

which I would recommend for most use cases (in continuous space). But be sure to gain at least a minimal understanding of what it does, so you can recognize the use cases where you don't want it (when discrete-space results are needed, or when function evaluation is slow); see the small sketch below.
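For illustration, here is a small sketch (my own, not from the scipy docs; the objective is made up) comparing the default finish=optimize.fmin with finish=None, which keeps the result exactly on the evaluated grid:

from scipy import optimize

def objective(z):
    x, y = z
    return (x - 1.3) ** 2 + (y + 0.7) ** 2

# one slice(start, stop, step) per parameter
ranges = (slice(-2, 2, 0.5), slice(-2, 2, 0.5))

x_grid = optimize.brute(objective, ranges, finish=None)  # best grid point only
x_polished = optimize.brute(objective, ranges)           # default finish=optimize.fmin
print(x_grid)      # stays on the grid, e.g. [ 1.5 -0.5]
print(x_polished)  # refined off-grid, close to [ 1.3 -0.7]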

If you are using sklearn, you already have scipy installed (it's a dependency).

Edit: here is a small plot I created (code) to show what finish is doing (local optimization) with a 1D example (not the best example, but easier to plot):

[figure: 1D example illustrating the local optimization performed by finish]

Sklearn can also be used independently of machine learning topics, hence, for the sake of completeness, I propose:

from sklearn.model_selection import ParameterGrid

param_grid = {'value_1': [1, 2, 3], 'value_2': [0, 1, 2, 3, 5]}
for params in ParameterGrid(param_grid):
    # `function` stands in for your own objective
    function(params['value_1'], params['value_2'])

Find the detailed documentation here.
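If you also want the best parameter set out of that loop, a small extension (my own sketch, with a made-up objective) could look like this:

from sklearn.model_selection import ParameterGrid

def my_objective(value_1, value_2):
    # hypothetical objective to minimize
    return (value_1 - 2) ** 2 + (value_2 - 3) ** 2

param_grid = {'value_1': [1, 2, 3], 'value_2': [0, 1, 2, 3, 5]}
best_params, best_score = None, float('inf')
for params in ParameterGrid(param_grid):
    score = my_objective(**params)
    if score < best_score:
        best_params, best_score = params, score

print(best_params, best_score)  # expected: {'value_1': 2, 'value_2': 3} 0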

You can also have a look at Bayesian Optimization. In this GitHub repository you can find an easy implementation.

The difference is that Bayesian Optimization doesn't only evaluate the specific values you put in; instead, it searches for values anywhere within the given range.

The example below is taken from their repository, so you can see how easy the implementation is!

def black_box_function(x, y):
    """Function with unknown internals we wish to maximize.

    This is just serving as an example, for all intents and
    purposes think of the internals of this function, i.e.: the process
    which generates its output values, as unknown.
    """
    return -x ** 2 - (y - 1) ** 2 + 1

from bayes_opt import BayesianOptimization

# Bounded region of parameter space
pbounds = {'x': (2, 4), 'y': (-3, 3)}

optimizer = BayesianOptimization(
    f=black_box_function,
    pbounds=pbounds,
    random_state=1,
)

optimizer.maximize(
    init_points=2,  # random exploration steps
    n_iter=3,       # Bayesian optimization steps
)

print(optimizer.max)
>>> {'target': -4.441293113411222, 'params': {'y': -0.005822117636089974, 'x': 2.104665051994087}}
