Delete rows from a pandas DataFrame if a combination of column values equals a tuple in a list

Question

I currently do this to delete rows whose value in a specific column, 'some_column', is found in a list, removal_list:

df = df[~df['some_column'].isin(removal_list)]

How can I do this if I want to compare a combination of values against, say, a list of tuples? (It doesn't have to be a list of tuples if there is a better way to achieve this.)

For example:

removal_list = [(item1,store1),(item2,store1),(item2,store2)]

If df['column_1'] and df['column_2'] of a specific row have the values item1 and store1 (or match any other tuple in removal_list), then delete that row.

Also, there may be more than two columns that need to be checked.

EDIT: a better example:

   client  account_type   description
0       1             2  photographer
1       2             2        banker
2       3             3        banker
3       4             2    journalist
4       5             4    journalist

remove_list = [(2, 'journalist'), (3, 'banker')]

Check on the columns account_type and description.

Output:

   client  account_type   description
0       1             2  photographer
1       2             2        banker
4       5             4    journalist

Answer 1

Say you have

removal_list = [(item1,store1),(item2,store1),(item2,store2)]

Then

df[['column_1', 'column_2']].apply(tuple, axis=1)

should create a Series of tuples, and so

df[['column_1', 'column_2']].apply(tuple, axis=1).isin(removal_list)

is the boolean condition you're after. Removal then works the same way as before: negate the condition with ~ and index df with it. This should work for any number of columns.

Example

>>> import pandas as pd
>>> df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
>>> df[['a', 'b']].apply(tuple, axis=1).isin([(1, 3), (30, 40)])
0     True
1    False
dtype: bool
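
To tie this back to the question's edited example, here is a minimal sketch that wraps the same apply(tuple, axis=1).isin(...) idea in a small function. The helper name drop_matching_rows and its parameters are made up for illustration and are not part of pandas.

```python
import pandas as pd

# Data from the question's edited example.
df = pd.DataFrame({
    "client": [1, 2, 3, 4, 5],
    "account_type": [2, 2, 3, 2, 4],
    "description": ["photographer", "banker", "banker", "journalist", "journalist"],
})
remove_list = [(2, "journalist"), (3, "banker")]


def drop_matching_rows(frame, columns, tuples):
    """Hypothetical helper: drop rows whose values in `columns` form a tuple in `tuples`."""
    # Build one tuple per row from the selected columns only.
    row_tuples = frame[columns].apply(tuple, axis=1)
    # Boolean mask aligned on frame's index; True marks rows to remove.
    mask = row_tuples.isin(tuples)
    return frame[~mask]


result = drop_matching_rows(df, ["account_type", "description"], remove_list)
print(result)
```

The order of names in columns has to match the order of values inside each tuple, since the comparison is positional.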

Answer 2

I suggest creating a DataFrame from the list and merging it with a left join and indicator=True, then keeping only the rows that found no match:

remove_list = [(2,'journalist'),(3,'banker')]

df1 = pd.DataFrame(remove_list, columns=['account_type','description'])
print (df1)
   account_type description
0             2  journalist
1             3      banker

# how='left' keeps every row of df; _merge is 'both' only for rows that also appear in df1
df = df.merge(df1, how='left', indicator=True).query('_merge != "both"').drop(columns='_merge')
print (df)
   client  account_type   description
0       1             2  photographer
1       2             2        banker
4       5             4    journalist
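
One thing to keep in mind is that merge returns a fresh 0..n-1 index, so the original row labels are lost if df did not start with a default index. The following is a sketch of one way around that, assuming an unnamed index; the labels 10 to 50 are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "client": [1, 2, 3, 4, 5],
        "account_type": [2, 2, 3, 2, 4],
        "description": ["photographer", "banker", "banker", "journalist", "journalist"],
    },
    index=[10, 20, 30, 40, 50],  # assumed non-default index, for illustration only
)
remove_list = [(2, "journalist"), (3, "banker")]
keys = ["account_type", "description"]

# De-duplicate so that a repeated tuple cannot duplicate rows of df in the merge.
to_remove = pd.DataFrame(remove_list, columns=keys).drop_duplicates()

merged = (
    df.reset_index()  # moves the original labels into a column named "index"
      .merge(to_remove, on=keys, how="left", indicator=True)
)
kept = (
    merged[merged["_merge"] == "left_only"]  # rows with no partner in to_remove
      .drop(columns="_merge")
      .set_index("index")   # restore the original labels
      .rename_axis(None)    # drop the leftover index name
)
print(kept)
```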

Answer 3

One way is to create a Series by zipping the two columns, then use Boolean indexing. I also advise using a set instead of a list for O(1) lookup.

remove_set = {(2,'journalist'),(3,'banker')}

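# index=df.index keeps the mask aligned with df even when df does not have a default 0..n-1 index
condition = pd.Series(list(zip(df.account_type, df.description)), index=df.index).isin(remove_set)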

res = df[~condition]

print(res)

   client  account_type   description
0       1             2  photographer
1       2             2        banker
4       5             4    journalist
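
Since the question mentions that more than two columns may be involved, here is a sketch of the same approach for an arbitrary list of columns, using itertuples(index=False, name=None) to produce plain tuples. The region column, its values, and the letter index are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "client": [1, 2, 3, 4, 5],
        "account_type": [2, 2, 3, 2, 4],
        "description": ["photographer", "banker", "banker", "journalist", "journalist"],
        "region": ["north", "south", "south", "north", "north"],  # hypothetical third key
    },
    index=list("abcde"),  # non-default index, to show that the mask stays aligned
)
remove_set = {(2, "journalist", "north"), (3, "banker", "south")}
cols = ["account_type", "description", "region"]

# One plain tuple per row, in the order given by cols.
row_tuples = pd.Series(
    list(df[cols].itertuples(index=False, name=None)),
    index=df.index,  # align the Series with df's own labels
)
condition = row_tuples.isin(remove_set)
res = df[~condition]
print(res)
```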

Answer 4

If the index is set to ['account_type', 'description'], we can use the drop method.

df.set_index(['account_type', 'description']).drop(remove_list).reset_index()

   account_type   description  client
0             2  photographer       1
1             2        banker       2
2             4    journalist       5
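
Note that this route reorders the columns and renumbers the rows, and drop by default raises a KeyError for labels it cannot find. A related sketch, offered as an alternative rather than as this answer's method, builds a MultiIndex from the key columns and tests membership with MultiIndex.isin, which leaves df's index and column order untouched. The tuple (9, 'pilot') is added here only to show a key that is absent from df.

```python
import pandas as pd

df = pd.DataFrame({
    "client": [1, 2, 3, 4, 5],
    "account_type": [2, 2, 3, 2, 4],
    "description": ["photographer", "banker", "banker", "journalist", "journalist"],
})
# (9, "pilot") does not occur in df; it is included for illustration.
remove_list = [(2, "journalist"), (3, "banker"), (9, "pilot")]
keys = ["account_type", "description"]

# One MultiIndex entry per row of df, built from the key columns.
row_keys = pd.MultiIndex.from_frame(df[keys])
# isin returns a NumPy boolean array in row order; absent tuples are simply ignored.
res = df[~row_keys.isin(remove_list)]
print(res)
```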

Answer 5

You could use the query method with an extra column to select against.

removal_list = [(2, 'journalist'), (3, 'banker')]

df['removal_column'] = df.apply(lambda x: (x.account_type, x.description), axis='columns')
df = df.query('removal_column not in @removal_list').drop('removal_column', axis='columns')

5 answers

Answer 1: 4, accepted, 2018-05-16 12:11:29
Answer 2: 2, 2018-05-16 12:18:11
Answer 3: 2, 2018-05-16 12:26:07
Answer 4: 2, 2018-05-16 13:13:30
Answer 5: 0, 2018-05-16 12:39:36
