Can I use string values as my dependent variable in a KNN machine learning model?

Question

I have data with 128 face encodings, where the label is the name of the person, and the column of names is my target column. I used LabelBinarizer to binarise the dependent variable (the name column). When I fit KNN and try to predict the name of the person, it doesn't predict anything.

It should be something like this:

But instead I got this:

Because of this, my first question is whether I can use string values as my dependent (target) variable at all. Any help is appreciated. Thank you.

For binarisation, I used this:

from sklearn.preprocessing import LabelBinarizer

# Binarising the labels
labelBinarised = LabelBinarizer()
Y_train = labelBinarised.fit_transform(Y_train)
Y_test = labelBinarised.fit_transform(Y_test)

Answer 1

You can use string values as your target variable. The documentation describes the target as {array-like, sparse matrix} of shape [n_samples] or [n_samples, n_outputs], and it does not say the values must be numeric. Your features need to be numeric, because they are used to calculate distances, but your target can be strings.

In the example below the target values are strings, and it works fine:

from sklearn.neighbors import KNeighborsClassifier

# Numeric features, string targets
X = [[0], [1], [2], [3]]
y = ['zero', 'zero', 'one', 'one']

neigh = KNeighborsClassifier(n_neighbors=3)
neigh.fit(X, y)

print(neigh.predict([[3]]))

# output
# ['one']
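
Applied to the setup in the question, a minimal sketch might look like the one below. The data is synthetic: the random 128-value vectors and the names alice, bob and carol are made up for illustration and are not the asker's data. It shows two options: passing the names to KNN directly, and, if binarised labels are wanted anyway, fitting LabelBinarizer once on the training labels, reusing it with transform on the test labels, and mapping predictions back to names with inverse_transform. One possible cause of "empty" predictions with binarised targets is that KNN votes on each binary column separately, so a row can come back as all zeros when the neighbours disagree.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelBinarizer

# Synthetic stand-in for face encodings (assumption, not the asker's data):
# three people, each with 10 noisy 128-value vectors around a random centre.
rng = np.random.default_rng(0)
names = ["alice", "bob", "carol"]
centers = rng.normal(size=(3, 128))
X_train = np.vstack([c + 0.05 * rng.normal(size=(10, 128)) for c in centers])
y_train = np.repeat(names, 10)
X_test = centers + 0.05 * rng.normal(size=(3, 128))
y_test = np.array(names)

# Option 1: pass the string labels straight to KNN.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
pred_names = knn.predict(X_test)

# Option 2: binarise, but fit the binarizer only once (on the training labels)
# so that train and test share the same column order.
lb = LabelBinarizer()
Y_train_bin = lb.fit_transform(y_train)
Y_test_bin = lb.transform(y_test)

knn_bin = KNeighborsClassifier(n_neighbors=3)
knn_bin.fit(X_train, Y_train_bin)
pred_bin = knn_bin.predict(X_test)              # one indicator row per sample
pred_from_bin = lb.inverse_transform(pred_bin)  # back to names

print(pred_names)
print(pred_from_bin)
```

Option 1 is usually the simpler choice for a single name column, since the predictions are already the names and no decoding step is needed.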

solution1
1 ACCEPTED 2020-08-26 03:54:46