Why does RandomizedSearchCV return degree or gamma values for a linear estimator?

Question

With sklearn I am using RandomizedSearchCV, and in one specific case the best estimator is:

SVR(C=1594.0828461797396, degree=0.8284528822863231, gamma=1.1891370222133257,kernel='linear')

But according to the sklearn documentation, degree and gamma apply only to the rbf and poly kernels. Why do I get a linear estimator with gamma and degree values?

from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVR
from scipy.stats import expon, reciprocal

# Search space: kernel choices plus continuous distributions for C, gamma, degree
param_distribs = {
        'kernel': ['linear', 'rbf', 'poly', 'sigmoid'],
        'C': reciprocal(20, 200000),
        'gamma': expon(scale=1.0),
        'degree': expon(scale=1.0),
    }

svm_reg = SVR()
rnd_search = RandomizedSearchCV(svm_reg, param_distributions=param_distribs,
                                n_iter=50, cv=5, scoring='neg_mean_squared_error',
                                verbose=2, random_state=42)
rnd_search.fit(X, y)  # X, y: training data, defined elsewhere (not shown)

Answer 1

RandomizedSearchCV always samples every parameter you specify, regardless of such restrictions, because it has no internal check for which combinations make sense for a particular estimator. Since gamma and degree are simply ignored when combined with a linear kernel, no error is raised, and the algorithm just runs with all parameters set on every iteration.
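The two claims above can be checked with a small sketch. It draws candidates with ParameterSampler, the scikit-learn utility for sampling from such distributions, and then fits two linear-kernel models that differ only in gamma and degree. The synthetic data from make_regression, the fixed C=10.0, and the particular gamma and degree values are assumptions made for illustration; the question's X and y are not shown.

```python
import numpy as np
from scipy.stats import expon, reciprocal
from sklearn.datasets import make_regression
from sklearn.model_selection import ParameterSampler
from sklearn.svm import SVR

# Same single-dict search space as in the question.
param_distribs = {
    'kernel': ['linear', 'rbf', 'poly', 'sigmoid'],
    'C': reciprocal(20, 200000),
    'gamma': expon(scale=1.0),
    'degree': expon(scale=1.0),
}

# Draw 50 candidate settings from the search space.
candidates = list(ParameterSampler(param_distribs, n_iter=50, random_state=42))
linear_candidates = [p for p in candidates if p['kernel'] == 'linear']
for params in linear_candidates[:3]:
    print(params)  # gamma and degree are present even though kernel='linear'

# Synthetic data (an assumption; the question's X, y are not shown).
X, y = make_regression(n_samples=80, n_features=3, noise=5.0, random_state=0)

# Two linear-kernel models that differ only in gamma and degree.
model_a = SVR(kernel='linear', C=10.0, gamma=0.01, degree=2).fit(X, y)
model_b = SVR(kernel='linear', C=10.0, gamma=5.0, degree=5).fit(X, y)

# If the linear kernel ignores both parameters, the predictions coincide.
same_predictions = bool(np.allclose(model_a.predict(X), model_b.predict(X)))
print(same_predictions)
```

Every sampled candidate carries all four keys, so the values reported for a linear best estimator are just whatever happened to be drawn alongside it.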

If you want to avoid this behavior, you can pass the parameter distributions as a list of dictionaries specifying which combinations are allowed. The documentation says of this case:

If a list of dicts is given, first a dict is sampled uniformly, and then a parameter is sampled using that dict as above.

So, for example, suppose you define the parameter distributions as follows:

param_distribs = [
    {
        'kernel': ['rbf','poly'],
        'C': reciprocal(20, 200000),
        'gamma': expon(scale=1.0),
        'degree': expon(scale=1.0)
    },
    {
        'kernel': ['linear','sigmoid'],
        'C': reciprocal(20, 200000)
    }
]

This prevents RandomizedSearchCV from setting gamma and degree in iterations where it picks the dictionary containing the linear kernel; the estimator then keeps its default values for them. Conversely, when it picks the other dictionary in a given iteration, it sets gamma and degree as well.
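As a rough check of this behavior, the sketch below samples from the list of dictionaries with ParameterSampler and inspects which keys each candidate carries. No model is fitted here; the n_iter and random_state values are simply carried over from the question.

```python
from collections import Counter

from scipy.stats import expon, reciprocal
from sklearn.model_selection import ParameterSampler

# The list-of-dicts search space from the answer.
param_distribs = [
    {
        'kernel': ['rbf', 'poly'],
        'C': reciprocal(20, 200000),
        'gamma': expon(scale=1.0),
        'degree': expon(scale=1.0),
    },
    {
        'kernel': ['linear', 'sigmoid'],
        'C': reciprocal(20, 200000),
    },
]

# One dict is picked per draw, then only that dict's parameters are sampled.
candidates = list(ParameterSampler(param_distribs, n_iter=50, random_state=42))

# Record which parameter names accompany each kernel.
keys_by_kernel = {}
for params in candidates:
    keys_by_kernel.setdefault(params['kernel'], set()).update(params)

for kernel, keys in sorted(keys_by_kernel.items()):
    print(kernel, sorted(keys))

# How often each kernel was drawn.
print(Counter(p['kernel'] for p in candidates))
```

The same list can be passed as param_distributions to RandomizedSearchCV. One caveat: the SVR documentation describes gamma as the kernel coefficient for rbf, poly and sigmoid, and degree as used by poly only, so it may be worth giving sigmoid its own dictionary that includes gamma.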

Answer 1: accepted 2020-07-08 20:52:20
