Deletion of a particular row in a csv file using pandas

I'm relatively new to pandas and am trying to delete from file XYZ every row that is also present in file ABC.

Code:

import pandas as pd

# Read the two CSV files
clm1 = pd.read_csv('ABC.csv')
clm2 = pd.read_csv('XYZ.csv')

# Print the number of rows in each file
print('Main file clm2: ' + str(len(clm2['image_url'])))
print('Referral file clm1: ' + str(len(clm1['Input.image_url'])))

for index1 in clm1.index:
    for index2 in clm2.index:
        if clm2['image_url'][index2] == clm1['Input.image_url'][index1]:
            print("Entered into deletion condition!!")

            print(clm2['image_url'][index2])
            print(clm1['Input.image_url'][index1])
            print('\n \n')

            clm2.drop(clm2['image_url'][index2], axis=0, inplace=True)
            print('Deleted!!')

print('Main file clm2: ' + str(len(clm2['image_url'])))

After entering the deletion condition, it correctly prints the following lines:

            print(clm2['image_url'][index2])
            print(clm1['Input.image_url'][index1])
            print('\n \n')

But an error occurs on this line:

clm2.drop(clm2['image_url'][index2], axis=0, inplace=True)

The error says:

  File "compare_delete_imagelinks.py", line 19, in <module>
    clm2.drop(clm2['image_url'][index2], axis=0, inplace=False)
  File "/Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/pandas/core/frame.py", line 3940, in drop
    errors=errors)
  File "/Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/pandas/core/generic.py", line 3780, in drop
    obj = obj._drop_axis(labels, axis, level=level, errors=errors)
  File "/Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/pandas/core/generic.py", line 3812, in _drop_axis
    new_axis = axis.drop(labels, errors=errors)
  File "/Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 4965, in drop
    '{} not found in axis'.format(labels[mask]))
KeyError: "['https://Xxxxxxx.216PPU~V.JPG'] not found in axis"
(MyDjangoEnv) SL-SP-LAP-0384:scripts AjayB$ 

How can I solve this?
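
The KeyError in the traceback is consistent with how DataFrame.drop works: with axis=0 it looks its argument up among the index labels, while the code above passes a value taken from the image_url column. A minimal sketch of that behaviour, using a hypothetical two-row frame with made-up URLs in place of XYZ.csv:

```python
import pandas as pd

# Hypothetical stand-in for XYZ.csv; the URLs are made up for illustration.
clm2 = pd.DataFrame({"image_url": ["https://example.com/a.jpg",
                                   "https://example.com/b.jpg"]})

# The default index holds the labels 0 and 1, not the URLs,
# so dropping by URL fails the same way as in the traceback.
try:
    clm2.drop("https://example.com/a.jpg", axis=0)
    raised = False
except KeyError:
    raised = True  # the URL is a column value, not an index label

# Passing the row's index label instead removes the row.
remaining = clm2.drop(0, axis=0)
print(raised)
print(remaining)
```

In the original loop, the index label of the matching row is index2, so that is the value drop would need rather than clm2['image_url'][index2].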

If your CSV files look like this, the following should work:

XYZ.csv:

name,value
a,1
b,2
c,3
d,4
e,5
f,6

ABC.csv:

name,value
a,1
b,2
c,3
d,4

Code:

import pandas as pd

# Use the 'name' column as the index so rows can be dropped by label
xyz = pd.read_csv("XYZ.csv", index_col='name')
abc = pd.read_csv("ABC.csv", index_col='name')

# Drop every row of xyz whose label also appears in abc
for i in abc.index:
    if i in xyz.index:
        xyz.drop(i, axis=0, inplace=True)

print(xyz)
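
As an alternative to dropping rows one at a time, the same result can probably be reached without a loop by filtering with Series.isin. The sketch below keeps the column names from the question (image_url and Input.image_url), but the CSV contents, the label column, and the in-memory buffers are assumptions made so that the example runs on its own:

```python
import io
import pandas as pd

# Hypothetical in-memory stand-ins for XYZ.csv and ABC.csv.
xyz_csv = io.StringIO("image_url,label\nu1,a\nu2,b\nu3,c\n")
abc_csv = io.StringIO("Input.image_url\nu1\nu3\n")

clm2 = pd.read_csv(xyz_csv)
clm1 = pd.read_csv(abc_csv)

# Boolean mask: True where the main file's URL also appears in the referral file.
in_referral = clm2["image_url"].isin(clm1["Input.image_url"])

# Keep only the rows whose URL is NOT in the referral file.
filtered = clm2[~in_referral].reset_index(drop=True)
print(filtered)
```

With real files, the two StringIO buffers would be replaced by the file paths, and filtered.to_csv(..., index=False) could write the result back out. This also avoids modifying the frame while iterating over its index.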
