How to print a simple list of feature importances when using Logistic Regression?
I am using the dataset found here: https://www.kaggle.com/pavansubhasht/ibm-hr-analytics-attrition-dataset
My code is:
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report

log_reg_model = LogisticRegression(max_iter=1000, solver="newton-cg")
log_reg_model = RFE(log_reg_model, n_features_to_select=45)  # using RFE to keep the top 45 features
log_reg_model.fit(X_train_SMOTE, y_train_SMOTE)  # fitting data
y_pred = log_reg_model.predict(X_test)
print("Model accuracy score: {}".format(accuracy_score(y_test, y_pred)))
print(classification_report(y_test, y_pred))
I am trying to print out the most important features in order, like when using the feature_importances_ attribute in Random Forest Classification.
Is the above possible using LR? I see similar questions on Stack Overflow, but no answers that show the feature names together with their importance.
To do this, you can use a method called SHAP. I would definitely recommend reading about SHAP before diving right into the code, as it's going to be important for you and others to understand exactly what you are presenting.
However, an example of how that could work in your implementation is:
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
import shap

log_reg_model = LogisticRegression(max_iter=1000, solver="newton-cg")
# log_reg_model = RFE(log_reg_model, n_features_to_select=45)  # using RFE to keep the top 45 features
log_reg_model.fit(X_train_SMOTE, y_train_SMOTE)  # fitting data
y_pred = log_reg_model.predict(X_test)
print("Model accuracy score: {}".format(accuracy_score(y_test, y_pred)))
print(classification_report(y_test, y_pred))

# LinearExplainer takes the fitted model and background data for the expected value
explainer = shap.LinearExplainer(log_reg_model, X_train_SMOTE)
shap_values = explainer.shap_values(X_test[:150])
shap.summary_plot(shap_values, feature_names=X_train_SMOTE.columns)
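If all you need is a plain ranked list rather than a plot, the logistic regression's own coef_ attribute can serve as a rough importance measure: the larger a coefficient's absolute value, the more the corresponding (standardized) feature moves the prediction. A minimal sketch, using synthetic data and made-up feature names in place of the HR dataset:

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Hypothetical stand-in for the HR dataset's feature columns
feature_names = [f"feature_{i}" for i in range(5)]
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X = pd.DataFrame(X, columns=feature_names)

model = LogisticRegression(max_iter=1000).fit(X, y)

# Rank features by the absolute value of their coefficients.
# Coefficients are only comparable if features share a scale,
# so standardize your real data first (e.g. with StandardScaler).
importance = pd.Series(np.abs(model.coef_[0]), index=feature_names)
print(importance.sort_values(ascending=False))
```

Note that unlike SHAP values, raw coefficients say nothing about interactions and are scale-dependent, so treat this as a quick sanity check rather than a full explanation.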