Data preprocessing for KNN in Python

Question

Preprocessing is taking me a lot of time to understand: tuple, list, float, and array structures. I have data that looks like this:

<bound method NDFrame.head of                                                       X                                 Y
0     [1.9902, 1.9902, 1.9902, 1.9902, 1.9902, 0.034...      [0.097, 0.097, 0.097, 0.094]
1     [1.9902, 0.034, 0.034, 0.034, 0.034, 0.034, 0....      [0.094, 0.094, 0.094, 0.094]
2     [0.034, 0.034, 0.097, 0.097, 0.097, 0.097, 0.0...  [1.0882, 1.0882, 1.0882, 1.0882]
3     [0.097, 0.097, 0.097, 0.094, 0.094, 0.094, 0.0...  [1.0882, 1.2382, 1.2382, 1.2382]
4     [0.094, 0.094, 0.094, 0.094, 1.0882, 1.0882, 1...  [1.2382, 1.2382, 1.2182, 1.2182]
...                                                 ...                               ...
3395  [0.136, 0.286, 0.286, 0.286, 0.286, 0.286, 0.2...  [0.1276, 0.1276, 0.1276, 0.1276]
3396  [0.286, 0.286, 0.266, 0.266, 0.266, 0.266, 0.2...   [1.1423, 1.2923, 1.2723, 3.672]
3397  [0.266, 0.266, 0.266, 0.1276, 0.1276, 0.1276, ...      [3.672, 3.672, 3.772, 3.772]
3398  [0.1276, 0.1276, 0.1276, 0.1276, 1.1423, 1.292...      [3.772, 3.802, 3.802, 3.802]
3399  [1.1423, 1.2923, 1.2723, 3.672, 3.672, 3.672, ...      [1.021, 1.021, 1.021, 1.021]

I am splitting the data using:

x=csv_data['X']
y=csv_data['Y']
#print(x)
x_train, x_test, y_train, y_test = train_test_split(x,y)

Fitting a KNN model:

K = []
training = []
test = []
scores = {}
  
for k in range(2, 21):
    clf = KNeighborsClassifier(n_neighbors = k)
    clf.fit(x_train, y_train)
  
    training_score = clf.score(x_train, y_train)
    test_score = clf.score(x_test, y_test)
    K.append(k)
  
    training.append(training_score)
    test.append(test_score)
    scores[k] = [training_score, test_score]

I get this error:

TypeError                                 Traceback (most recent call last)
TypeError: float() argument must be a string or a number, not 'list'

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
<ipython-input-93-906aa771beda> in <module>()
      6 for k in range(2, 21):
      7     clf = KNeighborsClassifier(n_neighbors = k)
----> 8     clf.fit(x_train, y_train)
      9 
     10     training_score = clf.score(x_train, y_train)

7 frames
/usr/local/lib/python3.7/dist-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
     81 
     82     """
---> 83     return array(a, dtype, copy=False, order=order)
     84 
     85 

ValueError: setting an array element with a sequence.

I have tried a few methods, such as preprocessing and StandardScaler, but they didn't work for me. Kindly help me get KNN running. Thanks.

Answer 1

The problem is that your y has shape (n, 4), and each row is stored as a Python list, which the KNN fit method cannot convert into a numeric array (the same applies to X). The simplest way around this is to predict one value at a time. So in short, you either run KNN 4 times, once for each column in y, or you don't use KNN.
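To see where the error comes from, here is a minimal sketch. The small `csv_data` frame below is a hypothetical stand-in for the asker's data, and it assumes every list within a column has the same length. It shows that a column of lists has object dtype, that converting it to floats fails, and that expanding the lists with `tolist()` gives a proper 2-D array.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for csv_data: every cell holds a Python list.
csv_data = pd.DataFrame({
    "X": [[1.9902, 0.034, 0.034], [0.034, 0.097, 0.097], [0.097, 0.094, 0.094]],
    "Y": [[0.097, 0.094], [1.0882, 1.0882], [1.2382, 1.2182]],
})

x = csv_data["X"]
print(x.dtype, x.shape)  # a 1-D object column, not a numeric matrix

# Converting the column of lists straight to floats is what fails.
try:
    np.asarray(x, dtype=float)
    conversion_failed = False
except (TypeError, ValueError) as exc:
    conversion_failed = True
    print("conversion failed:", exc)

# Expanding the lists row by row gives 2-D float arrays.
X = np.array(x.tolist(), dtype=float)
Y = np.array(csv_data["Y"].tolist(), dtype=float)
print(X.shape, Y.shape)
```

If the lists in a column have different lengths, this expansion will not produce a rectangular array, and the rows would need padding or truncating first.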

The code would look like this:

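# Import KNN for regression
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Each cell holds a list, so expand both columns into 2-D float arrays
# (assumes all lists within a column have the same length)
X = np.array(x.tolist(), dtype=float)
Y = np.array(y.tolist(), dtype=float)  # shape (n, 4)
k = 5  # example value; tune it as in the question's loop

# One target column per regressor
y1 = Y[:, 0]
y2 = Y[:, 1]
y3 = Y[:, 2]
y4 = Y[:, 3]

regressor1 = KNeighborsRegressor(n_neighbors=k).fit(X, y1)
regressor2 = KNeighborsRegressor(n_neighbors=k).fit(X, y2)
regressor3 = KNeighborsRegressor(n_neighbors=k).fit(X, y3)
regressor4 = KNeighborsRegressor(n_neighbors=k).fit(X, y4)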
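For comparison, scikit-learn's `KNeighborsRegressor` also accepts a 2-D y of shape (n_samples, n_outputs), so the four separate fits can be collapsed into one. The sketch below uses synthetic random data (an assumption, not the asker's data) and checks that the per-column models and a single multi-output model give the same predictions with the default uniform weights.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data: 200 rows, 8 features, 4 targets per row.
rng = np.random.default_rng(0)
X = rng.random((200, 8))
Y = rng.random((200, 4))

k = 5  # example value only

# One regressor per target column, as in the code above.
per_column = [
    KNeighborsRegressor(n_neighbors=k).fit(X, Y[:, j])
    for j in range(Y.shape[1])
]
pred_columns = np.column_stack([m.predict(X[:10]) for m in per_column])

# A single regressor fitted on all four targets at once.
multi = KNeighborsRegressor(n_neighbors=k).fit(X, Y)
pred_multi = multi.predict(X[:10])

print(pred_columns.shape, pred_multi.shape)
print(np.allclose(pred_columns, pred_multi))
```

The neighbours depend only on X, so both routes average the same rows. The single model is simply less code and searches for neighbours once instead of four times.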

OMG!! I now see that you were using KNN for classification, when in fact your problem is regression: your targets are continuous floats, not class labels. Your fundamentals are really, really poor.
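As a rough sketch of what the regression version of the question's loop could look like, the example below sweeps k from 2 to 20 with `KNeighborsRegressor`. The data is synthetic and the single target is an assumption made to keep the example short. Note that `score` on a regressor returns R², not accuracy.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data: the target depends on the first feature plus noise.
rng = np.random.default_rng(0)
X = rng.random((300, 8))
y = X[:, 0] + 0.1 * rng.standard_normal(300)

x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)

scores = {}
for k in range(2, 21):
    reg = KNeighborsRegressor(n_neighbors=k).fit(x_train, y_train)
    # score() is the R^2 coefficient of determination for regressors
    scores[k] = [reg.score(x_train, y_train), reg.score(x_test, y_test)]

# Pick the k with the highest test-set R^2.
best_k = max(scores, key=lambda k: scores[k][1])
print(best_k, scores[best_k])
```

Choosing k on the test set, as here and in the question, gives an optimistic estimate. A separate validation split or cross-validation would be the safer choice.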

Also, just don't use KNN for this. You won't get good results from it, and it's also computationally expensive.

Solution 1
0 2021-07-26 03:52:25
