Reading and using a CSV file in Python 3 with pandas
I have a CSV file:
Firstname Lastname City Province
'Guy', 'Ouell', 'Brossard','QC'
'Michelle', 'Balonne','Stittsville','ON'
'Ben', 'Sluzing','Toronto','ON'
'Theodora', 'Panapoulos','Saint-Constant','QC'
'Kathleen', 'Mercier','St Johns','NL'
...
Then I open it and check it, and everything looks fine:
import pandas as pd
df = pd.read_csv('a.csv')
df.head(n=5)
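When I try to use the columns, I run into two different problems:
Problem 1: I can only access the first column. When I try to use any other column, I get an error:
for mis_column, mis_row in missing_df.iterrows():
    print(mis_row['Firstname'])
I get all the first names, but when I try to get, for example, all the cities, I see: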
TypeError Traceback (most recent call last)
E:\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_value(self, series, key)
2482 try:
-> 2483 return libts.get_value_box(s, key)
2484 except IndexError:
pandas/_libs/tslib.pyx in pandas._libs.tslib.get_value_box (pandas\_libs\tslib.c:18843)()
pandas/_libs/tslib.pyx in pandas._libs.tslib.get_value_box (pandas\_libs\tslib.c:18477)()
TypeError: 'str' object cannot be interpreted as an integer
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-36-55ba81245685> in <module>()
1
2 for mis_column, mis_row in missing_df.iterrows():
----> 3 print(mis_row['City'])
4
5
E:\Anaconda3\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
599 key = com._apply_if_callable(key, self)
600 try:
--> 601 result = self.index.get_value(self, key)
602
603 if not is_scalar(result):
E:\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_value(self, series, key)
2489 raise InvalidIndexError(key)
2490 else:
-> 2491 raise e1
2492 except Exception: # pragma: no cover
2493 raise e1
E:\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_value(self, series, key)
2475 try:
2476 return self._engine.get_value(s, k,
-> 2477 tz=getattr(series.dtype, 'tz', None))
2478 except KeyError as e1:
2479 if len(self) > 0 and self.inferred_type in ['integer', 'boolean']:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'City'
Problem 2:
for mis_column, mis_row in df.iterrows():
    if mis_row['Firstname'] == 'Guy':
        print('A')
This does not print A.
Thanks in advance.
Separate the fields in the CSV header with commas, like this:
Firstname, Lastname, City, Province
'Guy', 'Ouell', 'Brossard','QC'
'Michelle', 'Balonne','Stittsville','ON'
'Ben', 'Sluzing','Toronto','ON'
'Theodora', 'Panapoulos','Saint-Constant','QC'
'Kathleen', 'Mercier','St John's','NL'
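Because the CSV has spaces after the commas, tell pandas to skip them when reading the DataFrame. With the default settings a space after a comma is kept as part of the field, so a header written this way yields a column named ' City' rather than 'City':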
df = pd.read_csv('<your_input>.csv', skipinitialspace=True)
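To see what skipinitialspace changes, here is a minimal sketch that parses a few of the rows above from an in-memory string instead of a file. The csv_text variable and the use of io.StringIO are assumptions made for illustration only; they are not part of the original post.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for the CSV file, using rows from the post.
csv_text = (
    "Firstname, Lastname, City, Province\n"
    "'Guy', 'Ouell', 'Brossard','QC'\n"
    "'Michelle', 'Balonne','Stittsville','ON'\n"
)

# Default parsing keeps the space that follows each comma in the header,
# so the column is named ' City' and looking up 'City' raises KeyError.
raw = pd.read_csv(io.StringIO(csv_text))
print(raw.columns.tolist())
# ['Firstname', ' Lastname', ' City', ' Province']

# skipinitialspace=True drops whitespace that directly follows a delimiter.
clean = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
print(clean.columns.tolist())
# ['Firstname', 'Lastname', 'City', 'Province']
```

Printing df.columns.tolist() on the real file is a quick way to check whether the column names contain stray spaces.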
If you also want to strip the single quotes from the values, pass quotechar as well:
df = pd.read_csv('<your_input>.csv', skipinitialspace=True, quotechar="'")
>>> df
Firstname Lastname City Province
0 Guy Ouell Brossard QC
1 Michelle Balonne Stittsville ON
2 Ben Sluzing Toronto ON
3 Theodora Panapoulos Saint-Constant QC
4 Kathleen Mercier St Johns' NL
>>> import pandas as pd
>>> df = pd.read_csv('test2.csv', skipinitialspace=True, quotechar="'")
>>> df
Firstname Lastname City Province
0 Guy Ouell Brossard QC
1 Michelle Balonne Stittsville ON
2 Ben Sluzing Toronto ON
3 Theodora Panapoulos Saint-Constant QC
4 Kathleen Mercier St Johns' NL
>>> for mis_column, mis_row in df.iterrows():
... if mis_row['Firstname'] == 'Guy':
... print('A')
...
A
>>>
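The comparison in Problem 2 fails for a related reason: unless quotechar is set, the single quotes stay inside each value. The following sketch shows the difference under the same assumptions as the data above; csv_text and io.StringIO are illustrative stand-ins for the real file.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for the CSV file, using rows from the post.
csv_text = (
    "Firstname, Lastname, City, Province\n"
    "'Guy', 'Ouell', 'Brossard','QC'\n"
    "'Ben', 'Sluzing','Toronto','ON'\n"
)

# The default quote character is the double quote, so the single quotes
# are treated as ordinary characters and remain part of each value.
unquoted = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
first_raw = unquoted.loc[0, "Firstname"]
print(first_raw == "Guy")    # False, the stored value is 'Guy' with quotes

# With quotechar="'" the parser strips the surrounding single quotes.
quoted = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True, quotechar="'")
first_clean = quoted.loc[0, "Firstname"]
print(first_clean == "Guy")  # True
```

Once the quotes are stripped, the original iterrows() loop prints A as shown in the session above.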