
How to color certain bars in a matplotlib bar chart in Python?

I am examining the features chosen by a hybrid feature selection procedure that combines an embedded method with a wrapper method. First I compute the importance of every feature (the embedded step). Then I run the wrapper selection on the features kept by the embedded step and get the subset with the best model accuracy.

I already have the bar chart from the embedded step, and now I want to color only the bars of the features that the wrapper selection picked. How can I approach this? My code is below.

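############################################# Hybrid Feature Selection Methodology #####################################
# knn, X_train, X_train_knn and y_train are defined earlier (not shown).
# EFS is assumed to be mlxtend's ExhaustiveFeatureSelector.
import matplotlib.pyplot as plt
from sklearn.inspection import permutation_importance
from mlxtend.feature_selection import ExhaustiveFeatureSelector as EFS

#################### Embedded Method ########################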
# perform permutation importance
results = permutation_importance(knn, X_train, y_train, scoring='accuracy')
# get importance
importance = results.importances_mean
print(importance)
# summarize feature importance
for i,v in enumerate(importance):
    print('Feature: %0d, Score: %.5f' % (i,v))
# plot feature importance
plt.bar(range(len(importance)), importance)
plt.title('Permutation Feature Importance with KNN')
plt.xlabel('Features')
plt.ylabel('Feature Importance')
plt.show()

#################### Wrapper Method ########################
efs = EFS(knn, min_features=1, max_features=len(X_train_knn.columns), scoring='accuracy', print_progress=True, cv=2)

# fit the object to the training data.
efs = efs.fit(X_train_knn, y_train)
print('\n')
print('Best accuracy score: ', efs.best_score_ * 100)
print('Best subset (indices):', efs.best_idx_)
print('Best subset (corresponding names):', efs.best_feature_names_)

# keep the indices and names of the best feature subset
optimum_number_features = list(efs.best_idx_)
optimum_number_features_knn = list(efs.best_feature_names_)

A Minimal, Reproducible Example means, among other things, that anyone can run your code and get a result. That is why the example below uses hard-coded importances.

import matplotlib.pyplot as plt

# importance_list = list(zip(feature_name_list, results.importances_mean))
importance_list = [('quiz', 0.4080183920815765), ('time', 0.1779846287534165), ('hm', 0.1392329389521148), ('submitNum', 0.09889260035850235), ('class', 0.09379925836350246), ('post', 0.049803191453511066), ('startTime', 0.03226899003737626)]

plt.figure()

colors = ['b' for i in importance_list]

# selected_list is what your wrapper method returns, e.g. list(efs.best_feature_names_)
selected_list = ['quiz', 'time', 'hm']

for i, v in enumerate(importance_list):
    if v[0] in selected_list:
        colors[i] = 'r'

plt.bar([i[0] for i in importance_list], [i[1] for i in importance_list], color=colors)

plt.title('Permutation Feature Importance with KNN')
plt.xlabel('Features')
plt.ylabel('Feature Importance')

plt.show()
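
The color-assignment loop above can also be written as a small helper. This is only a sketch of the same idea: `bar_colors` and `plot_importance` are names made up for illustration, and the default colors simply mirror the 'r' and 'b' used above.

```python
def bar_colors(names, selected, selected_color="r", default_color="b"):
    """Return one color per bar: selected_color if the name is in `selected`."""
    selected = set(selected)  # a set makes the membership test cheap
    return [selected_color if name in selected else default_color
            for name in names]


def plot_importance(importance_list, selected):
    """Draw the importance bar chart with the selected features highlighted."""
    # Imported here so that bar_colors can be used without matplotlib installed.
    import matplotlib.pyplot as plt

    names = [name for name, _ in importance_list]
    scores = [score for _, score in importance_list]
    plt.bar(names, scores, color=bar_colors(names, selected))
    plt.title('Permutation Feature Importance with KNN')
    plt.xlabel('Features')
    plt.ylabel('Feature Importance')
    plt.show()
```

With the data from the example above, `plot_importance(importance_list, selected_list)` should produce the same chart, since `plt.bar` accepts a list of colors with one entry per bar.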


As for the zip function and the importances_mean attribute, I tested them with the example from the sklearn.inspection.permutation_importance documentation.

from sklearn.linear_model import LogisticRegression
from sklearn.inspection import permutation_importance

X = [[1, 9, 9],[1, 9, 9],[1, 9, 9],
     [0, 9, 9],[0, 9, 9],[0, 9, 9]]
y = [1, 1, 1, 0, 0, 0]

clf = LogisticRegression().fit(X, y)

result = permutation_importance(clf, X, y, n_repeats=10, random_state=0)

list(zip(['a', 'b', 'c'], result.importances_mean))

# Result:
# [('a', 0.4666666666666666), ('b', 0.0), ('c', 0.0)]
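
One caveat when going back to the question's code: the importances are computed on X_train, while the wrapper runs on X_train_knn. If X_train_knn holds only a subset of the columns, the positions in efs.best_idx_ refer to X_train_knn and will not line up with the bars, so matching by name is safer. Below is a minimal sketch of that mapping. The column lists and the index tuple are hypothetical stand-ins for list(X_train.columns), list(X_train_knn.columns) and efs.best_idx_, and the function names are made up for illustration.

```python
def subset_idx_to_names(subset_columns, best_idx):
    """Translate positions within the subset into feature names."""
    return [subset_columns[i] for i in best_idx]


def colors_for_all_features(all_columns, selected_names,
                            selected_color="r", default_color="b"):
    """One color per bar of the full chart, highlighting the selected names."""
    selected_names = set(selected_names)
    return [selected_color if name in selected_names else default_color
            for name in all_columns]


# Hypothetical data standing in for the question's objects
all_columns = ['quiz', 'time', 'hm', 'post']   # list(X_train.columns)
subset_columns = ['quiz', 'hm', 'post']        # list(X_train_knn.columns)
best_idx = (0, 1)                              # efs.best_idx_

selected = subset_idx_to_names(subset_columns, best_idx)
colors = colors_for_all_features(all_columns, selected)
print(selected)  # ['quiz', 'hm']
print(colors)    # ['r', 'b', 'r', 'b']
```

The resulting list can be passed as color=colors to plt.bar, provided the bars are drawn in the same order as all_columns.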
