
Deletion of a particular row in a csv file using pandas

I'm relatively new to pandas and am trying to delete from file XYZ every row that is present in file ABC.

Code:

import pandas as pd

# Reads two CSV files
clm1 = pd.read_csv('ABC.csv')
clm2 = pd.read_csv('XYZ.csv')

# Prints file length
print('Main file clm2: '+ str(len(clm2['image_url'])))
print('Referral file clm1: ' + str(len(clm1['Input.image_url'])))

for index1 in clm1.index:
    for index2 in clm2.index:
        if clm2['image_url'][index2] == clm1['Input.image_url'][index1]:
            print("Entered into deletion condition!!")

            print(clm2['image_url'][index2])
            print(clm1['Input.image_url'][index1])
            print('\n \n')

            clm2.drop(clm2['image_url'][index2], axis=0, inplace=True)
            print('Deleted!!')

print('Main file clm2: ' + str(len(clm2['image_url'])))

After entering the deletion condition, it correctly prints the following lines:

            print(clm2['image_url'][index2])
            print(clm1['Input.image_url'][index1])
            print('\n \n')

But it raises an error on this line:

clm2.drop(clm2['image_url'][index2], axis=0, inplace=True)

The error says:

  File "compare_delete_imagelinks.py", line 19, in <module>
    clm2.drop(clm2['image_url'][index2], axis=0, inplace=False)
  File "/Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/pandas/core/frame.py", line 3940, in drop
    errors=errors)
  File "/Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/pandas/core/generic.py", line 3780, in drop
    obj = obj._drop_axis(labels, axis, level=level, errors=errors)
  File "/Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/pandas/core/generic.py", line 3812, in _drop_axis
    new_axis = axis.drop(labels, errors=errors)
  File "/Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 4965, in drop
    '{} not found in axis'.format(labels[mask]))
KeyError: "['https://Xxxxxxx.216PPU~V.JPG'] not found in axis"
(MyDjangoEnv) SL-SP-LAP-0384:scripts AjayB$ 

How can I resolve this?
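
The KeyError most likely comes from how drop is called: with axis=0, DataFrame.drop expects index labels, but clm2['image_url'][index2] is the cell value (the URL itself). With the default integer index, no row is labeled with that URL, hence "not found in axis". A minimal sketch of the difference, using made-up URLs rather than the asker's data:

```python
import pandas as pd

# Hypothetical stand-in for XYZ.csv; the URLs are made up for illustration.
clm2 = pd.DataFrame({"image_url": ["https://example.com/a.jpg",
                                   "https://example.com/b.jpg"]})

# Passing the cell value fails: the index labels are 0 and 1, not URLs.
try:
    clm2.drop(clm2["image_url"][0], axis=0)
    drop_by_value_failed = False
except KeyError:
    drop_by_value_failed = True

# Passing the row label (here the integer 0) works.
remaining = clm2.drop(0, axis=0)

print(drop_by_value_failed)
print(remaining)
```

In the question's loop, that would mean passing index2 to drop instead of the URL.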

If your CSV files look like this, the following should work:

XYZ.csv:

name,value
a,1
b,2
c,3
d,4
e,5
f,6

ABC.csv:

name,value
a,1
b,2
c,3
d,4

Code:

import pandas as pd

xyz = pd.read_csv("XYZ.csv", index_col='name')
abc = pd.read_csv("ABC.csv", index_col='name')

for i in abc.index:
    if i in xyz.index:
        xyz.drop(i, axis=0, inplace=True)

print(xyz)
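
The loop can also be avoided altogether. As a sketch, assuming the column names from the question (image_url in XYZ.csv and Input.image_url in ABC.csv), a boolean mask built with Series.isin keeps only the rows of the main file whose URL does not appear in the referral file. The in-memory frames below are invented sample data standing in for the two CSV files:

```python
import pandas as pd

# Invented sample data; in practice these would come from pd.read_csv.
clm1 = pd.DataFrame({"Input.image_url": ["u1", "u3"]})        # stands in for ABC.csv
clm2 = pd.DataFrame({"image_url": ["u1", "u2", "u3", "u4"]})  # stands in for XYZ.csv

# True for every row of clm2 whose URL also occurs in clm1.
in_referral = clm2["image_url"].isin(clm1["Input.image_url"])

# Keep only the rows that are NOT in the referral file.
filtered = clm2[~in_referral].reset_index(drop=True)

print(filtered)
```

The result could then be saved with filtered.to_csv(..., index=False); writing to a new file name rather than over XYZ.csv is probably the safer choice.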

