
Incompatible dimension for X and Y matrices

I was wondering what I have wrong here. I get this error:

Traceback (most recent call last):
  File "main.py", line 37, in <module>
    y_pred = knn.predict(X_test)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/sklearn/neighbors/classification.py", line 149, in predict
    neigh_dist, neigh_ind = self.kneighbors(X)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/sklearn/neighbors/base.py", line 434, in kneighbors
    **kwds))
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/sklearn/metrics/pairwise.py", line 1448, in pairwise_distances_chunked
    n_jobs=n_jobs, **kwds)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/sklearn/metrics/pairwise.py", line 1588, in pairwise_distances
    return _parallel_pairwise(X, Y, func, n_jobs, **kwds)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/sklearn/metrics/pairwise.py", line 1206, in _parallel_pairwise
    return func(X, Y, **kwds)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/sklearn/metrics/pairwise.py", line 232, in euclidean_distances
    X, Y = check_pairwise_arrays(X, Y)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/sklearn/metrics/pairwise.py", line 125, in check_pairwise_arrays
    X.shape[1], Y.shape[1]))
ValueError: Incompatible dimension for X and Y matrices: X.shape[1] == 38 while Y.shape[1] == 43

I'm new to AI and can't find anything on the internet that really solves this problem, so any comment is appreciated. This is my code:

from sklearn.preprocessing import OneHotEncoder
from sklearn import metrics 
from sklearn.neighbors import KNeighborsClassifier 
from sklearn.model_selection import train_test_split
import pandas as pd

fileName = "breast-cancer-fixed.csv"

df = pd.read_csv(fileName)

X = df[df.columns[:-1]] 
y = df[df.columns[-1]]  

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=1) 

X_train = OneHotEncoder().fit_transform(X_train)
X_test = OneHotEncoder().fit_transform(X_test)

knn = KNeighborsClassifier(n_neighbors=3) 
knn.fit(X_train, y_train) 

y_pred = knn.predict(X_test)
print("kNN model accuracy:", metrics.accuracy_score(y_test, y_pred)) 

My CSV is massive and I can't upload it here, so I have included a small snippet:

age,menopause,tumor-size,inv-nodes,node-caps,deg-malig,breast,breast-quad,irradiat,Class
40-49,premeno,15-19,0-2,yes,3,right,left_up,no,recurrence-events
50-59,ge40,15-19,0-2,no,1,right,central,no,no-recurrence-events
50-59,ge40,35-39,0-2,no,2,left,left_low,no,recurrence-events
40-49,premeno,35-39,0-2,yes,3,right,left_low,yes,no-recurrence-events
40-49,premeno,30-34,3-5,yes,2,left,right_up,no,recurrence-events
50-59,premeno,25-29,3-5,no,2,right,left_up,yes,no-recurrence-events
50-59,ge40,40-44,0-2,no,3,left,left_up,no,no-recurrence-events
40-49,premeno,10-14,0-2,no,2,left,left_up,no,no-recurrence-events
40-49,premeno,0-4,0-2,no,2,right,right_low,no,no-recurrence-events
40-49,ge40,40-44,15-17,yes,2,right,left_up,yes,no-recurrence-events
50-59,premeno,25-29,0-2,no,2,left,left_low,no,no-recurrence-events
60-69,ge40,15-19,0-2,no,2,right,left_up,no,no-recurrence-events

Also, if I remove the last two lines of code (the prediction code), it runs fine with no errors.

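Each `OneHotEncoder().fit_transform(...)` call fits a brand-new encoder on whatever data it is given, so the training and test matrices end up with different numbers of columns (38 and 43 in your traceback). Try adding this line anywhere above the transforms: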

enc = OneHotEncoder(handle_unknown='ignore')

Then change the transform lines to the following, so the encoder is fitted once on the training data and reused for the test data:

enc = enc.fit(X_train)
X_train = enc.transform(X_train)
X_test = enc.transform(X_test)
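
To see why sharing one fitted encoder matters, here is a minimal sketch on toy data. The two small DataFrames are made up for illustration (they are not taken from your CSV); only the column names and category labels follow your snippet.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data standing in for the real CSV (illustrative values only).
train = pd.DataFrame({
    "age": ["40-49", "50-59", "50-59", "60-69"],
    "menopause": ["premeno", "ge40", "ge40", "ge40"],
})
test = pd.DataFrame({
    "age": ["40-49", "30-39"],           # "30-39" never appears in train
    "menopause": ["premeno", "premeno"],
})

# Two independent encoders: each one only learns the categories it sees,
# so the number of output columns differs between train and test.
separate_train = OneHotEncoder().fit_transform(train)
separate_test = OneHotEncoder().fit_transform(test)
print(separate_train.shape, separate_test.shape)  # (4, 5) (2, 3)

# One shared encoder: fitted on train only, then reused for test.
# handle_unknown="ignore" encodes unseen categories as all zeros
# for that feature instead of raising an error.
enc = OneHotEncoder(handle_unknown="ignore")
enc.fit(train)
shared_train = enc.transform(train)
shared_test = enc.transform(test)
print(shared_train.shape, shared_test.shape)      # (4, 5) (2, 5)
```

With the shared encoder both matrices have the same column layout, which is what `knn.predict` needs.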

I get this error:

```
Traceback (most recent call last):
  File "main.py", line 25, in <module>
    X_test = OneHotEncoder().transform(X_test)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/sklearn/preprocessing/_encoders.py", line 726, in transform
    check_is_fitted(self, 'categories_')
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/sklearn/utils/validation.py", line 914, in check_is_fitted
    raise NotFittedError(msg % {'name': type(estimator).__name__})
sklearn.exceptions.NotFittedError: This OneHotEncoder instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.
```
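
The traceback above shows line 25 calling `OneHotEncoder().transform(X_test)`, which creates a fresh, unfitted encoder rather than reusing the fitted `enc`. A small sketch of the difference, again on made-up toy data (the `breast` column values are illustrative, not your real file):

```python
import pandas as pd
from sklearn.exceptions import NotFittedError
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data, for illustration only.
X_train = pd.DataFrame({"breast": ["left", "right", "left"]})
X_test = pd.DataFrame({"breast": ["right", "left"]})

# Fit one encoder on the training data and keep a reference to it.
enc = OneHotEncoder(handle_unknown="ignore")
enc.fit(X_train)

# OneHotEncoder() builds a new instance that has never seen any data,
# so calling transform on it raises NotFittedError.
try:
    OneHotEncoder().transform(X_test)
    fresh_failed = False
except NotFittedError:
    fresh_failed = True
print(fresh_failed)  # True

# Reusing the fitted instance works; columns are ordered [left, right].
X_test_encoded = enc.transform(X_test)
print(X_test_encoded.toarray())
# [[0. 1.]
#  [1. 0.]]
```

So the fix is to call `enc.transform(X_test)` on the same `enc` object that was fitted on `X_train`.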

