
How to get the scores of each feature from sklearn.feature_selection.SelectKBest?

I am trying to get the scores of all the features of my data set.

import numpy
from sklearn.feature_selection import SelectKBest, chi2

file_data = numpy.genfromtxt(input_file)
y = file_data[:,-1]
X = file_data[:,0:-1]

x_new = SelectKBest(chi2, k='all').fit_transform(X,y)

Previously, the first row of X held the feature names as strings, and I was getting the error "Input contains NaN, infinity or a value too large for dtype('float64')". So now X contains only the data and y contains the target values (1, -1).

How can I get the score of each feature from SelectKBest? (I am trying to use univariate feature selection.)

thanks

Solution

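Fit the selector, keep a reference to it, and read its scores_ attribute:

import numpy
from sklearn.feature_selection import SelectKBest, chi2

file_data = numpy.genfromtxt(input_file)
y = file_data[:, -1]    # last column: target values
X = file_data[:, 0:-1]  # all other columns: features

# Keep the fitted selector so its attributes stay accessible
selector = SelectKBest(chi2, k='all').fit(X, y)
x_new = selector.transform(X)  # not needed to get the scores
scores = selector.scores_      # one chi2 score per column of X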


Your problem

When you call .fit_transform(features, target) directly, the fitted selector is not stored; you only get back the selected features. However, the scores are an attribute of the selector. To get them, call .fit(features, target) and keep the fitted selector. Once the selector is fitted, you can get the selected features by calling selector.transform(features), as you can see in the code above.

As I commented in the code, you don't need to transform the features to get the scores. Fitting the selector is enough.
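
If you also want to see which score belongs to which feature, something like the following sketch may help. It uses small synthetic data instead of your file, and the feature names ("f0" to "f3") are made up for illustration. Note that chi2 expects non-negative feature values, so the sketch generates non-negative integers.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

# Synthetic, non-negative data (chi2 rejects negative feature values).
rng = np.random.RandomState(0)
X = rng.randint(0, 10, size=(100, 4)).astype(float)
y = np.where(X[:, 0] > 4, 1, -1)  # target depends on the first feature only

feature_names = ["f0", "f1", "f2", "f3"]  # hypothetical names, one per column

# Fit only; no transform is needed to read the scores.
selector = SelectKBest(chi2, k="all").fit(X, y)

# Pair each name with its score and sort from highest to lowest.
ranked = sorted(
    zip(feature_names, selector.scores_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

The fitted selector also exposes pvalues_, with one p-value per feature, if you prefer to look at significance rather than the raw score.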

