Remove rows from a pandas DataFrame based on paired values

Question

I have a DataFrame like this:

import pandas as pd

df = pd.DataFrame({'User': ['a', 'a', 'a', 'b', 'b', 'b'],
                   'Type': ['101', '102', '101', '101', '101', '102'],
                   'Qty': [10, -10, 10, 30, 5, -5]})

I want to remove pairs of rows where df['Type'] is 101 and 102 and the df['Qty'] values cancel each other out. The final result should look like this:

df = pd.DataFrame({'User': ['a', 'b'],
                   'Type': ['101', '101'],
                   'Qty': [10, 30]})

I tried converting the negative values to absolute values and dropping duplicates:

df['Qty'] = df['Qty'].abs()
df.drop_duplicates(subset=['Qty'], keep='first')

But that incorrectly gave me a DataFrame like this:

df = pd.DataFrame({'User': ['a', 'b', 'b'],
                   'Type': ['101', '101', '101'],
                   'Qty': [10, 30, 5]})

Answer 1

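The idea is to build all combinations of index values within each group, then test each pair for whether it contains both Types and whether the Qty of the matched pair sums to 0: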

# the solution needs unique index values
df = df.reset_index(drop=True)

from itertools import combinations

out = set()

def f(x):
    # test every pair of rows within one User group
    for i in combinations(x.index, 2):
        a = x.loc[list(i)]
        # keep the pair if it holds both Types and its Qty values cancel
        if (set(a['Type']) == {'101', '102'}) and (a['Qty'].sum() == 0):
            out.add(i)

df.groupby('User').apply(f)

print (out)
{(0, 1), (4, 5), (1, 2)}
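
To make the pair test concrete, here is a minimal sketch of what happens for User 'a' alone, using plain Python instead of pandas. The names group_index, rows and matching are made up for this illustration and are not part of the answer's code.

```python
from itertools import combinations

# Index labels of the rows belonging to User 'a' in the example frame
group_index = [0, 1, 2]

# (Type, Qty) of each of those rows, copied from the question
rows = {0: ('101', 10), 1: ('102', -10), 2: ('101', 10)}

# Every unordered pair of labels
pairs = list(combinations(group_index, 2))
print(pairs)  # [(0, 1), (0, 2), (1, 2)]

# Keep pairs that contain both Types and whose Qty values sum to 0
matching = [
    p for p in pairs
    if {rows[i][0] for i in p} == {'101', '102'}
    and sum(rows[i][1] for i in p) == 0
]
print(matching)  # [(0, 1), (1, 2)]
```

Row 1 shows up in both matching pairs, which is why the next step is needed.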

If an index value is repeated across pairs, remove the whole pair that contains the repeat, here (1, 2):

s = pd.Series(list(out)).explode()
idx = s.index[s.duplicated()]
final = s.drop(idx)
print (final)
0    0
0    1
1    4
1    5
dtype: object
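
The step above can be traced on a fixed list of pairs. This is only a sketch: it uses a list rather than the set out so that the order, and therefore the result, is reproducible; the names pairs, dup_pairs and kept are illustrative.

```python
import pandas as pd

# Pairs in a fixed order (a list, not a set)
pairs = [(0, 1), (1, 2), (4, 5)]

# One row per pair member; both members share the pair's position as label
s = pd.Series(pairs).explode()
# labels: 0, 0, 1, 1, 2, 2   values: 0, 1, 1, 2, 4, 5

# Labels of pairs that reuse a row already seen in an earlier pair
dup_pairs = s.index[s.duplicated()]

# Dropping by label removes both members of such a pair
kept = s.drop(dup_pairs)
print(kept.tolist())  # [0, 1, 4, 5]
```

Which of two overlapping pairs survives depends on the order in which they are listed. The iteration order of a Python set is not something to rely on, so sorting first, for example sorted(out), is one way to make the choice deterministic.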

Finally, remove those rows from the original DataFrame:

df = df.drop(final)
print (df)
  User Type  Qty
2    a  101   10
3    b  101   30
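
For reuse, the steps could be wrapped in one function. The sketch below is a variation, not the answer's exact code: drop_cancelling_pairs is a hypothetical name, it loops over the groups instead of calling apply, and it resolves overlapping pairs greedily in sorted order instead of using explode and duplicated. On the example data both approaches give the same result.

```python
from itertools import combinations

import pandas as pd


def drop_cancelling_pairs(df, types=('101', '102')):
    """Hypothetical helper: per User, drop row pairs that hold both
    Types and whose Qty values sum to 0."""
    df = df.reset_index(drop=True)  # the approach needs unique labels
    wanted = set(types)

    # Collect every candidate pair within each User group
    pairs = []
    for _, group in df.groupby('User'):
        for i, j in combinations(group.index, 2):
            pair = group.loc[[i, j]]
            if set(pair['Type']) == wanted and pair['Qty'].sum() == 0:
                pairs.append((i, j))

    # Accept pairs in sorted order, skipping any that reuse a row
    used = set()
    for i, j in sorted(pairs):
        if i not in used and j not in used:
            used.update((i, j))

    return df.drop(index=sorted(used))


example = pd.DataFrame({'User': ['a', 'a', 'a', 'b', 'b', 'b'],
                        'Type': ['101', '102', '101', '101', '101', '102'],
                        'Qty': [10, -10, 10, 30, 5, -5]})
result = drop_cancelling_pairs(example)
print(result)
```

Checking every pair is quadratic in the group size, so this is only practical when each User has a modest number of rows.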

Answer 2

If there are only two 'Type' values (101 and 102 in this case), you can write a custom function as follows:

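Build a dictionary whose keys are the absolute values of 'Qty'.
The dictionary's values are lists of the 'Type' values that correspond to each 'Qty'.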

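from collections import defaultdict

def f(x):
    # maps abs(Qty) -> list of 'Type' values not yet cancelled
    new = defaultdict(list)
    for k, v in x[['Type', 'Qty']].itertuples(index=False, name=None):
        if not new[abs(v)]:
            new[abs(v)].append(k)
        elif new[abs(v)][-1] != k:
            # opposite 'Type' with the same absolute 'Qty': cancel the pair
            new[abs(v)].pop()
        else:
            new[abs(v)].append(k)
    # the index holds abs(Qty), the values hold the surviving 'Type' lists
    return pd.Series(new, name='Type').rename_axis(index='Qty')

The logic is simple:

Whenever a new key is encountered, or its list is empty, add the corresponding 'Type' to the list.
If the key already exists, check whether the last value, i.e. the previously added 'Type', equals the current 'Type'. If they do not match, for example new = {10: ['101']} and the current 'Type' is '102', remove '101', so new = {10: []}.
If the key already exists and the last 'Type' matches the current 'Type', just append the current 'Type' to the list. For example, if new = {10: ['101']} and the current 'Type' is '101', append it, so new = {10: ['101', '101']}.

df.groupby('User').apply(f).explode().dropna().reset_index()

  User  Qty Type
0    a   10  101
1    b   30  101

Note that 'Qty' in this result is the absolute value.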
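
The cancelling logic can also be traced without pandas. The following is a minimal sketch on plain tuples; cancel, user_a and user_b are names invented for the illustration.

```python
from collections import defaultdict


def cancel(rows):
    """rows: iterable of (type, qty) tuples for a single user."""
    stacks = defaultdict(list)
    for type_, qty in rows:
        stack = stacks[abs(qty)]
        if stack and stack[-1] != type_:
            stack.pop()          # opposite Type on top: the two rows cancel
        else:
            stack.append(type_)  # empty list or same Type: keep for now
    return dict(stacks)


user_a = [('101', 10), ('102', -10), ('101', 10)]
user_b = [('101', 30), ('101', 5), ('102', -5)]

print(cancel(user_a))  # {10: ['101']}
print(cancel(user_b))  # {30: ['101'], 5: []}
```

The empty list for key 5 is what later turns into NaN after explode() and is removed by dropna(). As in the function above, only the 'Type' and the absolute 'Qty' are compared, so two rows with different Types and the same sign of 'Qty' would cancel as well.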

Answer 3

Iterating over all records and saving the matches in a list, making sure that no index is paired more than once, seems to work here.


import pandas as pd

df = pd.DataFrame({'User': ['a', 'a', 'a', 'b', 'b', 'b'],
                   'Type': ['101', '102', '101', '101', '101', '102'],
                   'Qty': [10, -10, 10, 30, 5, -5]})

# create a list to collect all indices that we are going to remove
records_to_remove = []
# a dictionary to map which group mirrors the other
pair = {'101': '102', '102': '101'}

# let's go over each row one by one
for i in df.index:
    # look the row up by label, since i comes from df.index
    current_record = df.loc[i]
    # if we haven't stored this index already for removal
    if i not in records_to_remove:
        pair_type = pair[current_record['Type']]
        pair_quantity = -1 * current_record['Qty']
        # search for all possible matches to this row
        match_records = df[(df['Type'] == pair_type) & (df['Qty'] == pair_quantity)]
        if match_records.empty:
            # if no matches are found, move on to the next row
            continue
        else:
            # if a match is found, take the first of such records
            first_match_index = match_records.index[0]
            if first_match_index not in records_to_remove:
                # store the indices in the list to remove only if they're not already present
                records_to_remove.append(i)
                records_to_remove.append(first_match_index)

df = df.drop(records_to_remove)

Output:

   User Type  Qty
2     a  101   10
3     b  101   30
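
The loop above matches rows regardless of 'User'. If pairs should only cancel within the same user, as the question's expected output suggests, a variation could look like the sketch below. It is an assumption-laden rewrite rather than this answer's code: drop_pairs_per_user is a hypothetical name, and it picks the first match that has not been used yet instead of only checking the very first match.

```python
import pandas as pd


def drop_pairs_per_user(df, pair=None):
    """Hypothetical variant: cancel 101/102 rows only within one User."""
    pair = pair or {'101': '102', '102': '101'}
    removed = set()
    for i, row in df.iterrows():
        if i in removed:
            continue
        # rows of the same User with the mirrored Type and negated Qty
        candidates = df[(df['User'] == row['User'])
                        & (df['Type'] == pair[row['Type']])
                        & (df['Qty'] == -row['Qty'])]
        for j in candidates.index:
            if j not in removed:
                removed.update((i, j))  # pair the two rows exactly once
                break
    return df.drop(index=list(removed))


example = pd.DataFrame({'User': ['a', 'a', 'a', 'b', 'b', 'b'],
                        'Type': ['101', '102', '101', '101', '101', '102'],
                        'Qty': [10, -10, 10, 30, 5, -5]})
result = drop_pairs_per_user(example)
print(result)

# Rows of different users are left alone
cross = pd.DataFrame({'User': ['a', 'b'],
                      'Type': ['101', '102'],
                      'Qty': [10, -10]})
untouched = drop_pairs_per_user(cross)
```

Using a set for the removed labels keeps the membership checks cheap compared with a list.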

See if this works for you!

3 answers

Answer 1
3, accepted, 2020-07-02 06:25:49

Answer 2
2, 2020-07-02 09:18:53

Answer 3
2, 2020-07-02 09:22:31
