
Identify the number of neighbors that resulted in the max score in the training/testing dataset


I am working on a GridSearchKNN case study and am confused about how to "identify the number of neighbors that resulted in the max score in the training/testing dataset." This is my first time working with KNN.

I tried:

from sklearn.metrics import accuracy_score
b_m = knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)
print(accuracy_score(y_test, y_pred))

There are a few options.

Option 1 - Using argmax.

# Assumes test_scores[i] is the score for k = i + 1 (k starts at 1); needs `import numpy as np`
best_k = np.argmax(test_scores) + 1
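
Option 1 needs the list of scores to exist first. A minimal sketch of how it could be built, assuming k runs from 1 to 9 as in Option 2, and using scikit-learn's bundled iris data as a stand-in because the question's own X and y are not shown (the dataset, the split settings, and the names k_values, train_scores, best_k_train and best_k_test are all assumptions for illustration):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data (assumption): the question's own dataset is not shown.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

k_values = list(range(1, 10))
train_scores, test_scores = [], []
for k in k_values:
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    # Record accuracy on both splits for this k
    train_scores.append(knn.score(X_train, y_train))
    test_scores.append(knn.score(X_test, y_test))

# np.argmax returns a 0-based position; map it back to the actual k
best_k_train = k_values[int(np.argmax(train_scores))]
best_k_test = k_values[int(np.argmax(test_scores))]
print("best k on training set:", best_k_train)
print("best k on testing set:", best_k_test)
```

Indexing into k_values instead of adding 1 keeps the result correct if the range of k is changed later. np.argmax returns the first maximum, so ties resolve to the smallest k.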

Option 2 - Update the best k during iteration.

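from sklearn.neighbors import KNeighborsClassifier

best_score = -1
for i in range(1, 10):
    # Fit a classifier with i neighbors and score it on the test set
    knn = KNeighborsClassifier(n_neighbors=i)
    knn.fit(X_train, y_train)
    score = knn.score(X_test, y_test)
    # Strict > keeps the smallest k when scores tie
    if score > best_score:
        best_score = score
        best_k = i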

If you only want the best k (that is, you don't need the scores for the other values of k), option 2 is better.
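
Since the case study in the question mentions grid search, the same search could also be sketched with scikit-learn's GridSearchCV. Note that this is a different technique from the two options above: it picks k by cross-validated score on the training data rather than by the score on the test set. The iris data, the range of k, and cv=5 are assumptions for illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data (assumption): the question's own dataset is not shown.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Try k = 1..9 with 5-fold cross-validation on the training split
param_grid = {"n_neighbors": list(range(1, 10))}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X_train, y_train)

best_k = search.best_params_["n_neighbors"]
print("best k by cross-validation:", best_k)
print("mean CV accuracy at best k:", search.best_score_)
# By default the search refits on the full training split with the best k
print("test accuracy at best k:", search.score(X_test, y_test))
```

The per-k cross-validated scores are available in search.cv_results_["mean_test_score"] if the full curve is needed.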

