Feature importance for logistic regression using GridSearchCV

Question

I've trained a logistic regression model like this:

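from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

reg = LogisticRegression(random_state = 40)
cvreg = GridSearchCV(reg, param_grid={'C':[0.05,0.1,0.5],
                                      'penalty':['none','l1','l2'],
                                      'solver':['saga']},
                     cv = 5)
cvreg.fit(X_train, y_train)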

Now, to show the feature importance, I've tried this code, but I don't get the names of the coefficients in the plot:

from matplotlib import pyplot

importance = cvreg.best_estimator_.coef_[0]
pyplot.bar([x for x in range(len(importance))], importance)
pyplot.show()

Obviously, the plot isn't very informative. How do I add the names of the coefficients to the x-axis?

The coefficient values are:

cvreg.best_estimator_.coef_
array([[1.10303023e+00, 7.48816905e-01, 4.27705027e-04, 6.01404570e-01]])

Answer 1

The coefficients correspond to the columns of X_train, so pass in the X_train column names instead of range(len(importance)).

Assuming X_train is a pandas DataFrame:

import matplotlib.pyplot as plt

features = X_train.columns
importance = cvreg.best_estimator_.coef_[0]

plt.bar(features, importance)
plt.show()

Note that if X_train is just a NumPy array without column names, you will have to define the features list yourself, based on your own data dictionary.
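For the NumPy case, a minimal sketch of what that could look like is below. The four coefficient values are copied from the question's output; the feature names are hypothetical placeholders, since the real column names aren't given, and they must be listed in the same order as the columns of X_train.

```python
# Coefficient values copied from the question's output
# (i.e. cvreg.best_estimator_.coef_[0]).
importance = [1.10303023e+00, 7.48816905e-01, 4.27705027e-04, 6.01404570e-01]

# Hypothetical names (assumption): replace these with your real column names,
# in the same order as the columns of X_train.
features = ["feature_a", "feature_b", "feature_c", "feature_d"]

# A mismatch here means the labels would not line up with the coefficients.
if len(features) != len(importance):
    raise ValueError("need exactly one feature name per coefficient")

# Pair each name with its coefficient and sort by absolute size,
# largest first, so the most influential features are listed at the top.
ranked = sorted(zip(features, importance), key=lambda pair: abs(pair[1]), reverse=True)

for name, coef in ranked:
    print(f"{name}: {coef:.6f}")
```

The same features list can then be passed as the first argument to plt.bar in place of X_train.columns.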
