I use SelectKBest to select the most important features in my data set, but the length of X_new is the same as the length of X.
Here is my simple code:
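from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import chi2
from sklearn.feature_selection import mutual_info_classif

# 4 samples (rows), each with 4 features (columns)
X = [[1, 4, 3, 5], [4, 5, 4, 5], [6, 3, 8, 3], [6, 3, 10, 7]]
# one class label per sample
Y = [1, 1, 2, 2]
# no score_func is passed, so the default (f_classif) is used
X_new = SelectKBest(k=2).fit_transform(X, Y)
len(X_new)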
This is the expected result. SelectKBest(k=2) keeps the 2 best features (columns) of every set (row) in X, scoring each feature against the labels provided in Y.
With the labels in Y (1 for the first two sets and 2, twice that value, for the last two), SelectKBest picks the 2nd and 3rd elements of each set, which gives
[[ 4  3]
 [ 5  4]
 [ 3  8]
 [ 3 10]]
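To see why those two columns are chosen, you can inspect the fitted selector. The sketch below assumes the default score function (f_classif, since the question passes no score_func) and converts X and Y to NumPy arrays for convenience; the names selector, scores and selected are only illustrative.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest

X = np.array([[1, 4, 3, 5], [4, 5, 4, 5], [6, 3, 8, 3], [6, 3, 10, 7]])
Y = np.array([1, 1, 2, 2])

# Keep the fitted selector around instead of chaining fit_transform,
# so its attributes can be inspected afterwards.
selector = SelectKBest(k=2)  # default score_func is f_classif
X_new = selector.fit_transform(X, Y)

# One score per input column; higher means the column separates the classes better.
scores = selector.scores_

# Zero-based indices of the columns that were kept.
selected = selector.get_support(indices=True)

print(scores)
print(selected)  # [1 2], i.e. the 2nd and 3rd columns
print(X_new)
```

The 3rd column gets the highest score and the 2nd column the next highest, which is why those two survive and the 1st and 4th are dropped.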
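So the result is exactly what you are supposed to get :). The resulting array has length 4 (same as X) because len() counts the sets (rows), and SelectKBest keeps every row; it only reduces each row to its top 2 elements. SelectKBest should produce a new array of shape Number_of_sets_in_input_array * K. In your case that is 4*2 (as X is a 4*4 array, and you chose k=2). To see the reduction, look at the number of columns rather than at len(X_new).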
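A short check of the sizes involved, as a sketch using the data from the question (X and Y are converted to NumPy arrays here so that .shape is available on the input as well; n_rows and n_features are illustrative names):

```python
import numpy as np
from sklearn.feature_selection import SelectKBest

X = np.array([[1, 4, 3, 5], [4, 5, 4, 5], [6, 3, 8, 3], [6, 3, 10, 7]])
Y = np.array([1, 1, 2, 2])

X_new = SelectKBest(k=2).fit_transform(X, Y)

# len() only reports the first dimension: the number of rows (samples).
n_rows = len(X_new)

# The feature count is the second dimension of the shape.
n_features = X_new.shape[1]

print(X.shape)      # (4, 4)
print(X_new.shape)  # (4, 2)
print(n_rows, n_features)  # 4 2
```

Comparing X.shape[1] with X_new.shape[1] is the direct way to confirm that features were dropped.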