
Feature Importance Chart in neural network using Keras in Python

I am using Python (3.6), Anaconda (64 bit), and Spyder (3.1.2). I have already set up a neural network model using Keras (2.0.6) for a regression problem (one response, 10 variables). I was wondering how I can generate a feature importance chart like this:

[image: feature importance chart]

from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor

def base_model():
    model = Sequential()
    model.add(Dense(200, input_dim=10, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal'))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

clf = KerasRegressor(build_fn=base_model, epochs=100, batch_size=5, verbose=0)
clf.fit(X_train, Y_train)

I was recently looking for the answer to this question, found something useful for what I was doing, and thought it would be helpful to share. I ended up using the permutation importance module from the eli5 package. It works most easily with a scikit-learn model. Luckily, Keras provides a wrapper for sequential models. As shown in the code below, using it is very straightforward.

from keras.wrappers.scikit_learn import KerasRegressor
import eli5
from eli5.sklearn import PermutationImportance

def base_model():
    model = Sequential()
    ...
    return model

X = ...
y = ...

my_model = KerasRegressor(build_fn=base_model, **sk_params)
my_model.fit(X, y)

# fit the permutation importance meter on the trained model
perm = PermutationImportance(my_model, random_state=1).fit(X, y)
eli5.show_weights(perm, feature_names=X.columns.tolist())
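For intuition, here is a minimal NumPy sketch of the idea behind permutation importance (not eli5's actual implementation): shuffle one feature column at a time and measure how much the model's error grows. The linear `predict` function below is a synthetic stand-in for any fitted regressor, such as the `KerasRegressor` above.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Error increase when each feature column is shuffled."""
    rng = np.random.default_rng(seed)
    base_mse = np.mean((predict(X) - y) ** 2)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        scores = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])  # destroy feature j's signal
            scores.append(np.mean((predict(Xp) - y) ** 2))
        importances[j] = np.mean(scores) - base_mse
    return importances

# toy data: feature 0 matters a lot, feature 1 a little, feature 2 not at all
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1]
predict = lambda X: 3.0 * X[:, 0] + 0.5 * X[:, 1]  # "perfectly fitted" model

imp = permutation_importance(predict, X, y)
print(np.argsort(imp)[::-1])  # -> [0 1 2]
```

A feature whose column can be shuffled without hurting the predictions gets an importance near zero, which is exactly the ranking `eli5.show_weights` displays.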

This is a relatively old post with relatively old answers, so I would like to offer another suggestion: using SHAP to determine feature importance for your Keras models. SHAP supports both 2d and 3d arrays, whereas eli5 currently only supports 2d arrays (so if your model uses layers that require 3d input, like LSTM or GRU, eli5 will not work).

Here is the link to an example of how SHAP can plot the feature importance for your Keras models, but in case it ever breaks, some sample code and plots are provided below as well (taken from said link):


import shap

# load your data here, e.g. X and y
# create and fit your model here

# load JS visualization code to notebook
shap.initjs()

# explain the model's predictions using SHAP
# (same syntax works for LightGBM, CatBoost, scikit-learn and spark models;
# note that TreeExplainer is for tree-based models -- for a Keras network,
# use shap.DeepExplainer or shap.KernelExplainer instead)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# visualize the first prediction's explanation (use matplotlib=True to avoid Javascript)
shap.force_plot(explainer.expected_value, shap_values[0,:], X.iloc[0,:])

# rank all features by their overall impact on the predictions
shap.summary_plot(shap_values, X, plot_type="bar")
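The bar summary plot ranks features by the mean absolute SHAP value across all samples. Given a `shap_values` array of shape `(n_samples, n_features)`, that ranking can be reproduced directly; the sketch below uses synthetic stand-in values rather than a real explainer's output.

```python
import numpy as np

# synthetic stand-in for explainer.shap_values(X): (n_samples, n_features),
# with per-feature attribution scales of 2.0, 0.5 and 0.1
rng = np.random.default_rng(0)
shap_values = rng.normal(scale=[2.0, 0.5, 0.1], size=(100, 3))

# the bar summary plot orders features by mean |SHAP value|
mean_abs = np.abs(shap_values).mean(axis=0)
ranking = np.argsort(mean_abs)[::-1]
print(ranking)  # -> [0 1 2]
```

This is the same aggregation `plot_type="bar"` performs before drawing, so it is a quick way to get the ranking as data rather than as a figure.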

[image: SHAP force plot and summary bar plot]

At the moment, Keras does not provide any built-in functionality to extract feature importance.

You can check this previous question: Keras: Any way to get variable importance?

or the related Google Group thread: Feature importance

Spoiler: in the Google Group, someone announced an open source project to solve this issue.

Disclaimer: the technical posts on this site follow the CC BY-SA 4.0 license; if you need to republish, please credit this site's URL or the original source. For any questions, contact: yoyou2525@163.com.

© 2020-2024 STACKOOM.COM