
Can I use string values as my dependent variable in KNN machine learning model?

I have data with 128 face encodings, labelled with the name of the person; the name column is my target column. I used LabelBinarizer to binarise the dependent variable (the name column). When I fit KNN and use it to predict the name of the person, it doesn't predict anything.

It should be something like this:

[image]

But instead I got this:

[image]

Because of this, my first doubt was whether I can use string values as my dependent (target) variable at all. Any help is appreciated. Thank you.

For binarisation, I used this:

# Binarising the labels
from sklearn.preprocessing import LabelBinarizer

labelBinarised = LabelBinarizer()
Y_train = labelBinarised.fit_transform(Y_train)
Y_test = labelBinarised.fit_transform(Y_test)

You can use string values as your target variable. The documentation describes the target as {array-like, sparse matrix}, target values of shape = [n_samples] or [n_samples, n_outputs]; it does not say the values must be numeric. Your features need to be numeric because they are used to calculate distances, but your target can be strings.

In the example below the target values are strings and it works fine:

X = [[0], [1], [2], [3]]
y = ['zero', 'zero', 'one', 'one']

from sklearn.neighbors import KNeighborsClassifier

neigh = KNeighborsClassifier(n_neighbors=3)
neigh.fit(X, y)

print(neigh.predict([[3]]))

# output
# ['one']
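
To connect this back to the question, here is a minimal sketch on made-up data. It assumes each face encoding is a 128-value vector; the names `alice`, `bob`, `carol`, the helper `make_split`, and the synthetic clusters are all hypothetical stand-ins, not the asker's data. It shows two things: passing the names straight to KNN, and, if binarised labels are still wanted, fitting the `LabelBinarizer` once on the training labels and reusing it (with `transform`, not a second `fit_transform`) on the test labels.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelBinarizer

# Hypothetical stand-in for face encodings: three people, each a tight
# cluster of 128-value vectors around a distinct random centre.
rng = np.random.default_rng(0)
names = ["alice", "bob", "carol"]
centres = rng.normal(size=(len(names), 128))

def make_split(n_per_person):
    """Return (X, y) with n_per_person noisy samples per name."""
    X = np.vstack([c + 0.05 * rng.normal(size=(n_per_person, 128))
                   for c in centres])
    y = np.repeat(names, n_per_person)
    return X, y

X_train, y_train = make_split(10)
X_test, y_test = make_split(3)

# Option 1: use the string names directly as the target.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
pred_names = knn.predict(X_test)  # array of name strings

# Option 2: binarise, but fit the binarizer on the training labels only
# and reuse the same fitted object for the test labels.
lb = LabelBinarizer()
Y_train_bin = lb.fit_transform(y_train)
Y_test_bin = lb.transform(y_test)

knn_bin = KNeighborsClassifier(n_neighbors=3)
knn_bin.fit(X_train, Y_train_bin)
pred_bin = knn_bin.predict(X_test)        # one 0/1 column per name
decoded = lb.inverse_transform(pred_bin)  # back to name strings
```

With one-hot targets, KNN treats every column as a separate binary output, so a row of all zeros is possible when no single name gets a majority among the neighbours. Whether that is what happened in the question cannot be told from the post, but passing the names directly (option 1) avoids the issue.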

