Deletion of a particular row in a csv file using pandas
I am relatively new to pandas and am trying to delete from file XYZ every row that is also present in file ABC.
Code:
import pandas as pd

# Read the two CSV files
clm1 = pd.read_csv('ABC.csv')
clm2 = pd.read_csv('XYZ.csv')

# Print the number of rows in each file
print('Main file clm2: ' + str(len(clm2['image_url'])))
print('Referral file clm1: ' + str(len(clm1['Input.image_url'])))

for index1 in clm1.index:
    for index2 in clm2.index:
        if clm2['image_url'][index2] == clm1['Input.image_url'][index1]:
            print("Entered into deletion condition!!")
            print(clm2['image_url'][index2])
            print(clm1['Input.image_url'][index1])
            print('\n \n')
            # This is the line that raises the KeyError shown below
            clm2.drop(clm2['image_url'][index2], axis=0, inplace=True)
            print('Deleted!!')

print('Main file clm2: ' + str(len(clm2['image_url'])))
After entering the deletion condition, it correctly prints the following lines:
print(clm2['image_url'][index2])
print(clm1['Input.image_url'][index1])
print('\n \n')
But an error is raised on this line:
clm2.drop(clm2['image_url'][index2], axis=0, inplace=True)
The error says:
  File "compare_delete_imagelinks.py", line 19, in <module>
    clm2.drop(clm2['image_url'][index2], axis=0, inplace=False)
  File "/Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/pandas/core/frame.py", line 3940, in drop
    errors=errors)
  File "/Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/pandas/core/generic.py", line 3780, in drop
    obj = obj._drop_axis(labels, axis, level=level, errors=errors)
  File "/Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/pandas/core/generic.py", line 3812, in _drop_axis
    new_axis = axis.drop(labels, errors=errors)
  File "/Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 4965, in drop
    '{} not found in axis'.format(labels[mask]))
KeyError: "['https://Xxxxxxx.216PPU~V.JPG'] not found in axis"
(MyDjangoEnv) SL-SP-LAP-0384:scripts AjayB$
How can I solve this?
If your csv files look like this, the following should work:
XYZ.csv:
name,value
a,1
b,2
c,3
d,4
e,5
f,6
ABC.csv:
name,value
a,1
b,2
c,3
d,4
Code:
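import pandas as pd

# Use the 'name' column as the index so rows can be dropped by label
xyz = pd.read_csv("XYZ.csv", index_col='name')
abc = pd.read_csv("ABC.csv", index_col='name')

# Drop from xyz every row whose index label also appears in abc
for i in abc.index:
    if i in xyz.index:
        xyz.drop(i, axis=0, inplace=True)

print(xyz)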
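The reason the original loop fails is that `DataFrame.drop` with `axis=0` treats its first argument as an index label, so passing the URL value itself raises `KeyError` unless that URL happens to be a label in the index. The loop above avoids this by making the compared column the index. As an alternative, here is a minimal sketch that skips the loop entirely and filters with a boolean mask built by `Series.isin`. The column names `image_url` and `Input.image_url` come from the question; the sample URLs and the helper name `remove_matching_rows` are made up for illustration, and the in-memory CSV text stands in for the real `ABC.csv` and `XYZ.csv`.

```python
import io

import pandas as pd

# Hypothetical stand-ins for ABC.csv and XYZ.csv (sample URLs are invented).
abc_csv = io.StringIO(
    "Input.image_url\n"
    "https://example.com/a.jpg\n"
    "https://example.com/c.jpg\n"
)
xyz_csv = io.StringIO(
    "image_url\n"
    "https://example.com/a.jpg\n"
    "https://example.com/b.jpg\n"
    "https://example.com/c.jpg\n"
    "https://example.com/d.jpg\n"
)

clm1 = pd.read_csv(abc_csv)  # referral file
clm2 = pd.read_csv(xyz_csv)  # main file


def remove_matching_rows(main, referral,
                         main_col="image_url",
                         ref_col="Input.image_url"):
    """Return the rows of `main` whose `main_col` value is absent from `referral[ref_col]`."""
    # True for every row of `main` whose value also appears in the referral column
    mask = main[main_col].isin(referral[ref_col])
    # Keep only the non-matching rows and renumber the index from 0
    return main[~mask].reset_index(drop=True)


filtered = remove_matching_rows(clm2, clm1)
print(filtered)
```

This returns a new DataFrame instead of modifying `clm2` in place, which also sidesteps dropping rows from a frame while iterating over its index. If the loop is preferred, passing the row label (`index2` in the question's code) to `drop` instead of the cell value should address the same error.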