Error whilst importing csv file using pandas - python
I am trying to read the items in a csv file using pandas and then encode them.
Here is my code:
import sklearn
from sklearn.utils import shuffle
from sklearn.neighbors import KNeighborsClassifier
import pandas as pd
import numpy as np
from sklearn import linear_model, preprocessing
data = pd.read_csv("car.data") # import in data
print(data.head()) # show the top few lines of data
le = preprocessing.LabelEncoder() # object to change data into a numerical value
buying = le.fit_transform(list(data["buying"])) # input buying column into object le
maint = le.fit_transform(list(data["maint"])) # input maint column into object le
door = le.fit_transform(list(data["door"])) # input door column into object le
persons = le.fit_transform(list(data["persons"])) # input persons column into object le
lug_boot = le.fit_transform(list(data["lug_boot"])) # input lug_boot column into object le
safety = le.fit_transform(list(data["safety"])) # input safety column into object le
cls = le.fit_transform(list(data["class"])) # input class column into object le
predict = "class" # what will be predicted
x = list(zip(buying, maint, door, persons, lug_boot, safety)) # will put all of the values into one list (x)
y = list(cls) # will convert numpy array (cls) into list
x_train, x_test, y_train, y_test = sklearn.model_selection.train_test_split(x, y, test_size = 0.1) # create new data so the machine can't memorise results
print(x_train, y_test) # show variables to test its working
And here are the first few lines of my car.data file:
buying, maint, door, persons, lug_boot, safety, class
vhigh,vhigh,2,2,small,low,unacc
vhigh,vhigh,2,2,small,med,unacc
vhigh,vhigh,2,2,small,high,unacc
vhigh,vhigh,2,2,med,low,unacc
vhigh,vhigh,2,2,med,med,unacc
vhigh,vhigh,2,2,med,high,unacc
vhigh,vhigh,2,2,big,low,unacc
vhigh,vhigh,2,2,big,med,unacc
vhigh,vhigh,2,2,big,high,unacc
I think I have done everything correctly, but I get the following error:
Traceback (most recent call last):
File "/opt/anaconda3/envs/tensor/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 2895, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'maint'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/name/PycharmProjects/Machine_learning/KNN/KNN Working File.py", line 13, in <module>
maint = le.fit_transform(list(data["maint"])) # input maint column into object le
File "/opt/anaconda3/envs/tensor/lib/python3.6/site-packages/pandas/core/frame.py", line 2906, in __getitem__
indexer = self.columns.get_loc(key)
File "/opt/anaconda3/envs/tensor/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 2897, in get_loc
raise KeyError(key) from err
KeyError: 'maint'
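What confuses me most is why it only gives me an error about the maint variable and not the buying variable. Please let me know what I did wrong, as I am quite confused. Thanks.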
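You have a leading space before 'maint' in the header row, so the actual key is ' maint'. The first column, 'buying', has no space in front of it, which is why the lookup only starts failing at 'maint'.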
Either fix the csv file, or pass skipinitialspace=True to pd.read_csv():
data = pd.read_csv("car.data", skipinitialspace=True)
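To see the difference without touching car.data, here is a minimal sketch that parses the same header from an in-memory string. The name CSV_TEXT and the use of io.StringIO are assumptions made for illustration only; the second option (stripping the column names after loading) is an alternative that the answer above does not mention.

```python
import io

import pandas as pd

# Stand-in for car.data (assumption): the same header, with a space after
# each comma, followed by two of the data rows from the question.
CSV_TEXT = (
    "buying, maint, door, persons, lug_boot, safety, class\n"
    "vhigh,vhigh,2,2,small,low,unacc\n"
    "vhigh,vhigh,2,2,small,med,unacc\n"
)

# Default parsing keeps the space after each comma as part of the column name.
raw = pd.read_csv(io.StringIO(CSV_TEXT))
print(raw.columns.tolist())

# Option 1: tell the parser to skip whitespace that follows a delimiter.
fixed = pd.read_csv(io.StringIO(CSV_TEXT), skipinitialspace=True)
print(fixed.columns.tolist())

# Option 2: load as-is, then strip whitespace from every column name.
stripped = raw.rename(columns=str.strip)
print(stripped.columns.tolist())
```

With either option, data["maint"] and the remaining column lookups in the question should resolve.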
This file works on my end:
In [2]: !cat a.csv
buying, maint, door, persons, lug_boot, safety, class
vhigh,vhigh,2,2,small,low,unacc
vhigh,vhigh,2,2,small,med,unacc
vhigh,vhigh,2,2,small,high,unacc
vhigh,vhigh,2,2,med,low,unacc
vhigh,vhigh,2,2,med,med,unacc
vhigh,vhigh,2,2,med,high,unacc
vhigh,vhigh,2,2,big,low,unacc
vhigh,vhigh,2,2,big,med,unacc
vhigh,vhigh,2,2,big,high,unacc
In [3]: pd.read_csv("a.csv")
Out[3]:
   buying   maint   door   persons   lug_boot   safety   class
0   vhigh   vhigh      2         2      small      low   unacc
1   vhigh   vhigh      2         2      small      med   unacc
2   vhigh   vhigh      2         2      small     high   unacc
3   vhigh   vhigh      2         2        med      low   unacc
4   vhigh   vhigh      2         2        med      med   unacc
5   vhigh   vhigh      2         2        med     high   unacc
6   vhigh   vhigh      2         2        big      low   unacc
7   vhigh   vhigh      2         2        big      med   unacc
8   vhigh   vhigh      2         2        big     high   unacc
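Note that a table which prints cleanly does not show whether the column names contain extra whitespace, because pandas right-aligns the headers when it displays a DataFrame. The sketch below, again using an in-memory stand-in for the file (CSV_TEXT and lookup_error are hypothetical names), reads the data without error and then shows that looking up "maint" still fails.

```python
import io

import pandas as pd

# Stand-in for a.csv (assumption): the header from the session above plus one row.
CSV_TEXT = (
    "buying, maint, door, persons, lug_boot, safety, class\n"
    "vhigh,vhigh,2,2,small,low,unacc\n"
)

# Reading succeeds, and the printed table looks fine at a glance.
df = pd.read_csv(io.StringIO(CSV_TEXT))
print(df)

# repr() puts quotes around each name, which makes a leading space visible.
print([repr(name) for name in df.columns])

# The lookup from the question still fails on this DataFrame.
try:
    df["maint"]
except KeyError as err:
    lookup_error = err
    print("KeyError:", err)
else:
    lookup_error = None
```

So a successful pd.read_csv() call is not enough to confirm the column keys; checking the repr of the names is a quick way to rule out stray whitespace.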