
Python, SelectKBest doesn't work

I use SelectKBest to select the most important features in my data set, but the length of X_new is the same as the length of X.

Here is my simple code:

from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import chi2
from sklearn.feature_selection import mutual_info_classif
X=[[1,4,3,5],[4,5,4,5],[6,3,8,3],[6,3,10,7]]
Y=[1,1,2,2]
X_new = SelectKBest(k=2).fit_transform(X, Y)
len(X_new)

This is the expected result. SelectKBest(k=2) keeps the 2 best features (columns) of X, scored against the labels in Y. It does not change the number of rows (samples).

With the default scoring function (f_classif), the 2nd and 3rd columns of X score highest against Y, so SelectKBest keeps those two columns in every row, which gives

[[ 4  3]
 [ 5  4]
 [ 3  8]
 [ 3 10]]
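
To see why those two columns are kept, one can inspect the fitted selector. The following is a minimal sketch, assuming scikit-learn's default score function f_classif (passed explicitly here for clarity); the variable names selector, scores and kept are only illustrative.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

X = np.array([[1, 4, 3, 5], [4, 5, 4, 5], [6, 3, 8, 3], [6, 3, 10, 7]])
y = np.array([1, 1, 2, 2])

# Fit only, so the per-feature scores can be inspected before transforming.
selector = SelectKBest(score_func=f_classif, k=2).fit(X, y)

scores = selector.scores_                   # one ANOVA F-score per column
kept = selector.get_support(indices=True)   # zero-based indices of the kept columns

print(scores)
print(kept)
```

The two highest scores belong to the columns at zero-based indices 1 and 2, which are the 2nd and 3rd columns of X.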

This is exactly what you are supposed to get :). len(X_new) is 4 (same as len(X)) because len() counts rows (samples), and SelectKBest only removes columns (features), never rows. The result is an array of shape number_of_samples * k. In your case it is 4*2 (X is a 4*4 array, and you chose k=2). To see the reduction, check X_new.shape rather than len(X_new).
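
As a quick check, here is a small sketch that compares the row count with the full shape, using the same data as the question. The names n_rows and shape are only illustrative.

```python
from sklearn.feature_selection import SelectKBest

X = [[1, 4, 3, 5], [4, 5, 4, 5], [6, 3, 8, 3], [6, 3, 10, 7]]
Y = [1, 1, 2, 2]

X_new = SelectKBest(k=2).fit_transform(X, Y)

n_rows = len(X_new)   # number of samples, unchanged by feature selection
shape = X_new.shape   # (samples, selected features)

print(n_rows)
print(shape)
```

The second dimension of the shape is what drops from 4 to k.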
