
Incompatible dimension for X and Y matrices

I am wondering what I have wrong here. I get this error:

Traceback (most recent call last):
  File "main.py", line 37, in <module>
    y_pred = knn.predict(X_test)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/sklearn/neighbors/classification.py", line 149, in predict
    neigh_dist, neigh_ind = self.kneighbors(X)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/sklearn/neighbors/base.py", line 434, in kneighbors
    **kwds))
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/sklearn/metrics/pairwise.py", line 1448, in pairwise_distances_chunked
    n_jobs=n_jobs, **kwds)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/sklearn/metrics/pairwise.py", line 1588, in pairwise_distances
    return _parallel_pairwise(X, Y, func, n_jobs, **kwds)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/sklearn/metrics/pairwise.py", line 1206, in _parallel_pairwise
    return func(X, Y, **kwds)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/sklearn/metrics/pairwise.py", line 232, in euclidean_distances
    X, Y = check_pairwise_arrays(X, Y)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/sklearn/metrics/pairwise.py", line 125, in check_pairwise_arrays
    X.shape[1], Y.shape[1]))
ValueError: Incompatible dimension for X and Y matrices: X.shape[1] == 38 while Y.shape[1] == 43

I'm new to AI and can't find anything on the internet that really solves this problem, so any comment is appreciated. This is my code:

from sklearn.preprocessing import OneHotEncoder
from sklearn import metrics 
from sklearn.neighbors import KNeighborsClassifier 
from sklearn.model_selection import train_test_split
import pandas as pd

fileName = "breast-cancer-fixed.csv";

df = pd.read_csv(fileName)

X = df[df.columns[:-1]] 
y = df[df.columns[-1]]  

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=1) 

X_train = OneHotEncoder().fit_transform(X_train)
X_test = OneHotEncoder().fit_transform(X_test)

knn = KNeighborsClassifier(n_neighbors=3) 
knn.fit(X_train, y_train) 

y_pred = knn.predict(X_test)
print("kNN model accuracy:", metrics.accuracy_score(y_test, y_pred)) 

My CSV is massive and I can't upload it here, so here is a small snippet:

age,menopause,tumor-size,inv-nodes,node-caps,deg-malig,breast,breast-quad,irradiat,Class
40-49,premeno,15-19,0-2,yes,3,right,left_up,no,recurrence-events
50-59,ge40,15-19,0-2,no,1,right,central,no,no-recurrence-events
50-59,ge40,35-39,0-2,no,2,left,left_low,no,recurrence-events
40-49,premeno,35-39,0-2,yes,3,right,left_low,yes,no-recurrence-events
40-49,premeno,30-34,3-5,yes,2,left,right_up,no,recurrence-events
50-59,premeno,25-29,3-5,no,2,right,left_up,yes,no-recurrence-events
50-59,ge40,40-44,0-2,no,3,left,left_up,no,no-recurrence-events
40-49,premeno,10-14,0-2,no,2,left,left_up,no,no-recurrence-events
40-49,premeno,0-4,0-2,no,2,right,right_low,no,no-recurrence-events
40-49,ge40,40-44,15-17,yes,2,right,left_up,yes,no-recurrence-events
50-59,premeno,25-29,0-2,no,2,left,left_low,no,no-recurrence-events
60-69,ge40,15-19,0-2,no,2,right,left_up,no,no-recurrence-events

Also, if I get rid of the last two lines of code (the prediction code), it runs fine with no errors.
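The 38 versus 43 columns in the error message are consistent with each `OneHotEncoder` learning its own set of categories from whatever data it is fitted on. A minimal sketch of that effect follows; the toy arrays are made up for illustration and are not taken from the real CSV:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: the test split happens to lack the "60-69" category.
X_train = np.array([["40-49"], ["50-59"], ["60-69"]], dtype=object)
X_test = np.array([["40-49"], ["50-59"]], dtype=object)

# Two independent encoders, each fitted on a different split,
# mirroring the two fit_transform calls in the question.
train_encoded = OneHotEncoder().fit_transform(X_train)
test_encoded = OneHotEncoder().fit_transform(X_test)

# The column counts differ because each encoder saw different categories.
print(train_encoded.shape)
print(test_encoded.shape)
```

Fitting the classifier alone only ever sees the training matrix, which is probably why the script runs cleanly until `predict` compares the two widths.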

Try adding this line anywhere above the transforms:

enc = OneHotEncoder(handle_unknown='ignore')

Then change the transform lines to the following:

enc = enc.fit(X_train)
X_train = enc.transform(X_train)
X_test = enc.transform(X_test)
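Put together, the whole flow might look like the sketch below. It assumes the twelve rows from the question's snippet stand in for the real file, uses a fixed 8/4 split instead of `train_test_split` so the run is repeatable, and treats every column (including `deg-malig`) as a string category. Those choices are illustrative assumptions, not part of the original code.

```python
import numpy as np
from sklearn import metrics
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import OneHotEncoder

# Rows copied from the CSV snippet in the question (features, then class label).
rows = [
    "40-49,premeno,15-19,0-2,yes,3,right,left_up,no,recurrence-events",
    "50-59,ge40,15-19,0-2,no,1,right,central,no,no-recurrence-events",
    "50-59,ge40,35-39,0-2,no,2,left,left_low,no,recurrence-events",
    "40-49,premeno,35-39,0-2,yes,3,right,left_low,yes,no-recurrence-events",
    "40-49,premeno,30-34,3-5,yes,2,left,right_up,no,recurrence-events",
    "50-59,premeno,25-29,3-5,no,2,right,left_up,yes,no-recurrence-events",
    "50-59,ge40,40-44,0-2,no,3,left,left_up,no,no-recurrence-events",
    "40-49,premeno,10-14,0-2,no,2,left,left_up,no,no-recurrence-events",
    "40-49,premeno,0-4,0-2,no,2,right,right_low,no,no-recurrence-events",
    "40-49,ge40,40-44,15-17,yes,2,right,left_up,yes,no-recurrence-events",
    "50-59,premeno,25-29,0-2,no,2,left,left_low,no,no-recurrence-events",
    "60-69,ge40,15-19,0-2,no,2,right,left_up,no,no-recurrence-events",
]
data = np.array([r.split(",") for r in rows], dtype=object)
X, y = data[:, :-1], data[:, -1]

# Fixed split for illustration: first 8 rows train, last 4 rows test.
X_train, X_test = X[:8], X[8:]
y_train, y_test = y[:8], y[8:]

# One encoder, fitted on the training data only, reused for the test data.
# handle_unknown='ignore' encodes categories unseen during fit as all zeros.
enc = OneHotEncoder(handle_unknown="ignore")
enc.fit(X_train)
X_train_enc = enc.transform(X_train)
X_test_enc = enc.transform(X_test)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train_enc, y_train)
y_pred = knn.predict(X_test_enc)

print("train shape:", X_train_enc.shape, "test shape:", X_test_enc.shape)
print("kNN model accuracy:", metrics.accuracy_score(y_test, y_pred))
```

Because both matrices come from the same fitted encoder, they have the same number of columns, so `predict` no longer hits the dimension check. Test-only values such as `60-69` simply contribute no active column.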

I get this error:

```
Traceback (most recent call last):
  File "main.py", line 25, in <module>
    X_test = OneHotEncoder().transform(X_test)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/sklearn/preprocessing/_encoders.py", line 726, in transform
    check_is_fitted(self, 'categories_')
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/sklearn/utils/validation.py", line 914, in check_is_fitted
    raise NotFittedError(msg % {'name': type(estimator).__name__})
sklearn.exceptions.NotFittedError: This OneHotEncoder instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.
```
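Line 25 in that traceback calls `OneHotEncoder().transform(X_test)`, which creates a brand-new encoder that has never been fitted, rather than reusing the fitted `enc`. A small sketch of the difference, using made-up one-column data:

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data for illustration only.
X_train = np.array([["yes"], ["no"]], dtype=object)
X_test = np.array([["no"]], dtype=object)

# A fresh OneHotEncoder() has learned no categories, so transform fails.
try:
    OneHotEncoder().transform(X_test)
    raised = False
except NotFittedError:
    raised = True
print("fresh encoder raised NotFittedError:", raised)

# Reusing the same fitted instance works.
enc = OneHotEncoder(handle_unknown="ignore").fit(X_train)
X_test_encoded = enc.transform(X_test)
print(X_test_encoded.shape)
```

So the transform lines should call `enc.transform(...)` on the instance that was fitted, not `OneHotEncoder().transform(...)`.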
