
Extracting feature importances along with column names from sklearn pipeline

I have a sklearn pipeline with two steps: a ColumnTransformer preprocessor containing a OneHotEncoder, and a RandomForestRegressor estimator. I would like to get the feature names of the encoded columns after one-hot encoding. My pipeline looks like this:

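from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# One-hot encode the categorical columns
categorical_preprocessor = OneHotEncoder(handle_unknown="ignore")

# Model processor: encode categorical columns, pass the rest through unchanged
preprocessor = ColumnTransformer(
    [("categorical", categorical_preprocessor, categorical_columns)],
    remainder="passthrough",
)

est = RandomForestRegressor(n_estimators=100, random_state=0)

pipe = make_pipeline(preprocessor, est)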

I am trying to get the feature names of the encoded columns like this:

pipe['preprocessor'].transformers[0][0].get_feature_names(categorical_columns)

But I get this error:

'str' object has no attribute 'get_feature_names'
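
The message is consistent with how ColumnTransformer stores its steps: each entry of the transformers list is a (name, transformer, columns) tuple, so transformers[0][0] is the name string "categorical" rather than the encoder. A minimal sketch of this, assuming a hypothetical categorical column called "city":

```python
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

categorical_columns = ["city"]  # hypothetical column name, for illustration only

preprocessor = ColumnTransformer(
    [("categorical", OneHotEncoder(handle_unknown="ignore"), categorical_columns)],
    remainder="passthrough",
)

# Each entry is a (name, transformer, columns) tuple, so index [0][0]
# is the step name (a str) and index [0][1] is the encoder object.
name, transformer, columns = preprocessor.transformers[0]
entry_kinds = (type(name).__name__, type(transformer).__name__)
print(entry_kinds)
```

Once the pipeline has been fitted, the fitted encoder should be reachable through named_transformers_["categorical"] on the ColumnTransformer. Note also that make_pipeline names each step after its lowercased class name, so the preprocessor step would be "columntransformer", not "preprocessor".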

Since scikit-learn 1.0, the feature names can be extracted by calling get_feature_names_out() on the fitted pipeline with its final estimator step sliced off:

pipe[:-1].get_feature_names_out()
