Random Forest feature importance per value of a column in Python

I currently have a dataset with loads of neighbourhoods (samples). There is also one column called 'municipality' which has the name of the municipality to which the neighbourhood belongs. I made a random forest regressor to predict energy consumption in the Netherlands based on many features (of course the column 'municipality' was not used as a feature and it is not a class).

Sklearn has a feature importance attribute, but it covers the whole training dataset. I was wondering if it is possible to see, per municipality, which features were most important when training the model. I want to see if I can find any spatial differences between the feature importances of various municipalities.

At first I thought I could look, for each sample in the training data, at which features were most important, and then sum these up over all the samples (neighbourhoods) from the same municipality. But I can't find anything like this on Google.

Hope someone can help.

Thanks!

Feature importance is computed for the trained model as a whole. You can't ask for feature importances conditional on the value of one column, since the model automatically uses all the features it was trained on.

One idea is to train one model for each class of neighbourhoods (here, each municipality). Then you will have a list of feature importances for each class and can compare them. Of course, you can only do that if you have a relatively small number of different classes.

First, split your data on the column you care about. For example, assuming data is a pandas DataFrame with a 'municipality' column:

data1 = data[data['municipality'] == a]

data2 = data[data['municipality'] == b]

Then train a separate model on each subset. To compare importances between municipalities, look at each model's clf.feature_importances_ attribute and compare the results.
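The approach above can be sketched roughly as follows. This is only an illustration under assumptions: the data is synthetic, and the column names (income, house_age, household_size, energy) and the municipality labels "A" and "B" are made up, not taken from the question. It fits one RandomForestRegressor per municipality and collects each model's feature_importances_ into a single table.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the neighbourhood table (all names are hypothetical).
rng = np.random.default_rng(0)
n = 300
data = pd.DataFrame({
    "municipality": rng.choice(["A", "B"], size=n),
    "income": rng.normal(size=n),
    "house_age": rng.normal(size=n),
    "household_size": rng.normal(size=n),
})

# Make the target depend on a different feature in each municipality,
# so the per-municipality importances visibly differ.
is_a = data["municipality"] == "A"
data["energy"] = np.where(is_a, 3.0 * data["income"], 3.0 * data["house_age"])
data["energy"] += rng.normal(scale=0.1, size=n)

features = ["income", "house_age", "household_size"]

# Fit one model per municipality and keep its impurity-based importances.
importances = {}
for name, group in data.groupby("municipality"):
    clf = RandomForestRegressor(n_estimators=50, random_state=0)
    clf.fit(group[features], group["energy"])
    importances[name] = clf.feature_importances_

# Rows are features, columns are municipalities.
importance_table = pd.DataFrame(importances, index=features)
print(importance_table)
```

Each column of importance_table sums to 1, so the columns can be compared side by side. Keep in mind that a municipality with only a few neighbourhoods gives a small training set, and the importances from such a model may be unstable.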

for better answers write better questions
