KNN classifier gives different results when using predict and kneighbors from sklearn.neighbors.KNeighborsClassifier

I want to classify features extracted from a CNN with the k-nearest neighbors classifier from sklearn.neighbors.KNeighborsClassifier. But when I call predict() on the test data, it returns a class that differs from the majority vote I compute from kneighbors(). I am using the following pretrained ResNet50 model, which is one branch of a siamese network, to extract the features. Details of the siamese network can be found here.

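# Imports assumed to come from tensorflow.keras (not shown in the original post)
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import GlobalAveragePooling2D, Input
from tensorflow.keras.models import Model

def embedding_model():
    # IMAGE_SIZE is not defined in this snippet; it is set elsewhere in the poster's code
    baseModel = ResNet50(weights="imagenet", include_top=False,
                         input_tensor=Input(shape=(IMAGE_SIZE, IMAGE_SIZE, 3)))
    # freeze the first 165 layers
    for layer in baseModel.layers[:165]:
        layer.trainable = False

    # global average pooling turns the final feature map into one vector per image
    headModel = baseModel.output
    headModel = GlobalAveragePooling2D()(headModel)
    model = Model(inputs=baseModel.input, outputs=headModel, name='embedding_model')

    return model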

# get the embedding branch and its weights from the trained siamese model
embeddings_weights = siamese_test.get_layer('embedding_model').get_weights()
embeddings_branch = siamese_test.get_layer('embedding_model')

input_shape = (224, 224, 3)

# named `inputs` to avoid shadowing the built-in input()
inputs = Input(shape=input_shape)
x = embeddings_branch(inputs)

model = Model(inputs, x)
model.set_weights(embeddings_weights)
out_shape = model.layers[-1].output_shape

The model summary can be found here. I used the following function to extract the features with the model.

def create_features(dataset, pre_model, out_shape, batchSize=16):
    # run the embedding model, then reshape to (n_samples, n_features)
    features = pre_model.predict(dataset, batch_size=batchSize)
    features_flatten = features.reshape((features.shape[0], out_shape[1]))
    return features, features_flatten

train_features, train_features_flatten = create_features(x_train, model, out_shape, batchSize)
test_features, test_features_flatten = create_features(x_test, model, out_shape, batchSize)

Then I used a KNN classifier to predict on the test features:

from sklearn.neighbors import KNeighborsClassifier

KNN_classifier = KNeighborsClassifier(n_neighbors=3)
KNN_classifier.fit(train_features_flatten, y_train)

y_pred = KNN_classifier.predict(test_features_flatten)

I used the kneighbors() function to find the distances to the nearest neighbors and their corresponding indices. But the majority class among those neighbors differs from the class returned by predict().

neighbors_dist, neighbors_index = KNN_classifier.kneighbors(test_features_flatten)

#replace the index with actual class 
data2 = np.zeros(neighbors_index.shape, dtype=object)
for i in range(neighbors_index.shape[0]):
  for j in range(neighbors_index.shape[1]):
    data2[i,j] = str(y_test[neighbors_index[i][j]])

#get the majority class 
from collections import Counter
majority_class = np.array([Counter(sorted(row, reverse=True)).most_common(1)[0][0] for row in data2])

As you can see, the predicted class differs from the majority class for most of the first 10 samples:

for i, pred in enumerate(y_pred):
  print(i,pred)

for i, c in enumerate(majority_class):
  print(i,c)

Predicted output for the first samples (indices 0 to 10):

0 corduroy
1 wool
2 wool
3 brown_bread
4 wood
5 corduroy
6 corduroy
7 corduroy
8 wool
9 wood
10 corduroy

Majority class for the first samples (indices 0 to 10):

0 corduroy
1 cork
2 cork
3 lettuce_leaf
4 linen
5 corduroy
6 wool
7 corduroy
8 brown_bread
9 linen
10 wool

Is there anything I am doing wrong? Any help would be appreciated. Thank you.

This is incorrect:

    data2[i,j] = str(y_test[neighbors_index[i][j]])

The kneighbors method (and also predict) finds the training points nearest to each input, so the indices it returns refer to rows of the data passed to fit. You should therefore look up the labels in y_train here, not y_test.
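As a rough check of this fix, here is a minimal sketch on synthetic data. The blobs stand in for the CNN features, and the two class names and all variable names are illustrative assumptions, not the poster's data. It uses two classes with an odd n_neighbors so that no vote can tie.

```python
from collections import Counter

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the extracted features (assumption: not the original data)
X, y_int = make_blobs(n_samples=200, centers=2, n_features=8, random_state=0)
labels = np.array(["corduroy", "wool"])  # hypothetical class names
y = labels[y_int]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

# kneighbors returns indices into the data passed to fit (X_train)
_, neighbors_index = knn.kneighbors(X_test)

# Look the labels up in y_train, not y_test; the result has shape (n_test, 3)
neighbor_labels = y_train[neighbors_index]

# Majority vote per test sample (no ties: two classes, three neighbors)
majority_class = np.array(
    [Counter(row).most_common(1)[0][0] for row in neighbor_labels]
)

print(np.array_equal(majority_class, y_pred))
```

With the labels taken from y_train, the manual majority vote is expected to agree with predict() for every test sample. With more than two classes a vote can tie, and a Counter-based vote may then break the tie differently from scikit-learn, so a small number of mismatches could remain in that case.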
