Pandas - Compare column IDs across rows and conditionally delete

Question

Given a sample DataFrame such as:

Qid     Sid     L1  L2
id01    id02    74  72
id01    id03    74  68
id02    id01    72  74
id02    id03    72  68

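I want to remove reciprocal hits, i.e. rows whose Qid/Sid pair already appears in the opposite order, so the output should be: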

Qid     Sid     L1  L2
id01    id02    74  72
id01    id03    74  68
id02    id03    72  68

My real dataset has thousands of rows; the example above is only meant to illustrate the idea.

Answer 1

Here is another idea:

import pandas as pd

data = {'Qid': ['id01', 'id01', 'id02', 'id02'],
        'Sid': ['id02', 'id03', 'id01', 'id03'],
        'L1': [74, 74, 72, 72],
        'L2': [72, 68, 74, 68]}
df = pd.DataFrame(data)
# Cast the scores to strings so they can be sorted together with the IDs.
df[['L1', 'L2']] = df[['L1', 'L2']].astype(str)
# Collect the four columns of each row into a list.
df['aux'] = df[['Qid', 'Sid', 'L1', 'L2']].values.tolist()
# Sort each list and store it as a string, so reciprocal rows get the same key.
df['aux'] = df['aux'].apply(sorted).astype(str)
# Keep the first row for each key, then drop the helper column.
df = df.drop_duplicates(subset='aux').drop(columns='aux')
print(df)

Output:

    Qid   Sid  L1  L2
0  id01  id02  74  72
1  id01  id03  74  68
3  id02  id03  72  68
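
A variation on the same idea, offered as a minimal sketch rather than as part of the original answer: build the order-independent key from the two ID columns only, so the scores never have to be cast to strings. It assumes that a reciprocal hit is defined purely by the unordered pair (Qid, Sid); the helper name drop_reciprocal is hypothetical.

```python
import numpy as np
import pandas as pd


def drop_reciprocal(df, a="Qid", b="Sid"):
    """Keep the first row of each unordered (a, b) pair (hypothetical helper)."""
    # Sort the two IDs within each row so (x, y) and (y, x) yield the same key.
    pairs = np.sort(df[[a, b]].to_numpy(dtype=str), axis=1)
    key = pd.DataFrame(pairs, index=df.index)
    # duplicated() flags every repeat after the first occurrence of a key.
    return df[~key.duplicated()]


df = pd.DataFrame({
    "Qid": ["id01", "id01", "id02", "id02"],
    "Sid": ["id02", "id03", "id01", "id03"],
    "L1": [74, 74, 72, 72],
    "L2": [72, 68, 74, 68],
})
result = drop_reciprocal(df)
print(result)
```

On the sample frame this keeps rows 0, 1 and 3 and leaves L1 and L2 as integers. Because the sort is done on a NumPy array instead of a per-row apply, it may also scale more comfortably to thousands of rows.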

Solution 1
2 Accepted 2020-01-21 18:28:26
