Removing values from a list in a pandas DataFrame column based on another list
I have a column in a DataFrame which contains lists. I want to remove elements from these lists based on the elements of another list (as shown below).
I tried a list comprehension, but it leaves the column unchanged.
import pandas as pd
sys_list = ['sys1', 'sys2', 'sys3']
df = pd.DataFrame({'A':[['sys1', 'sys2', 'user1'],
['user3', 'user6', 'user1'],
['sys1', 'sys2', 'sys3']]})
df['A'] = [item for item in df['A'] if item not in sys_list]
print(df)
A
0 [sys1, sys2, user1]
1 [user3, user6, user1]
2 [sys1, sys2, sys3]
I need to achieve this:
A
0 [user1]
1 [user3, user6, user1]
2 []
Any thoughts?
With apply:
df.A.apply(lambda x: [i for i in x if i not in sys_list])
0 [user1]
1 [user3, user6, user1]
2 []
Name: A, dtype: object
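For context, the attempt in the question seems to leave the column unchanged because its comprehension iterates over the rows of df['A'], so each item is a whole list, and a list never equals one of the strings in sys_list. A minimal sketch of the difference, reusing the question's data (the names unchanged and filtered are introduced here only for illustration):

```python
import pandas as pd

sys_list = ['sys1', 'sys2', 'sys3']
df = pd.DataFrame({'A': [['sys1', 'sys2', 'user1'],
                         ['user3', 'user6', 'user1'],
                         ['sys1', 'sys2', 'sys3']]})

# Row-level filter: each `row` is a whole list, which never equals
# a string in sys_list, so every row is kept as-is.
unchanged = [row for row in df['A'] if row not in sys_list]

# Element-level filter: test each string inside each row.
filtered = df['A'].apply(lambda row: [x for x in row if x not in sys_list])

print(unchanged)
print(filtered.tolist())
```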
Use Series.apply:
df['B'] = df['A'].apply(lambda x: [item for item in x if item not in set(sys_list)])
print (df)
                       A                      B
0    [sys1, sys2, user1]                [user1]
1  [user3, user6, user1]  [user3, user6, user1]
2     [sys1, sys2, sys3]                     []
Or a similar list comprehension, as in a deleted answer:
df['B'] = [[item for item in l if item not in set(sys_list)] for l in df['A']]
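Both variants above call set(sys_list) inside the inner comprehension, so the set is rebuilt for every element that is tested. A small sketch that builds it once instead, assuming the same df and sys_list as in the question (the name sys_set is introduced here for illustration):

```python
import pandas as pd

sys_list = ['sys1', 'sys2', 'sys3']
df = pd.DataFrame({'A': [['sys1', 'sys2', 'user1'],
                         ['user3', 'user6', 'user1'],
                         ['sys1', 'sys2', 'sys3']]})

# Build the lookup set once instead of once per element.
sys_set = set(sys_list)

# Order and duplicates within each row are preserved.
df['B'] = [[item for item in row if item not in sys_set] for row in df['A']]

print(df['B'].tolist())
```

For a three-element sys_list the difference is negligible; it should matter more as the exclusion list and the column grow.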
Or a solution with sets, using set.difference (order within each list and duplicates are not preserved):
df['B'] = df['A'].map(lambda x: list(set(x).difference(sys_list)))
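Set difference is not symmetric, so it matters which side the row is on: the row's elements go on the left and sys_list on the right. A quick plain-Python check on the first row of the question's data (the variable names are illustrative only):

```python
sys_list = ['sys1', 'sys2', 'sys3']
row = ['sys1', 'sys2', 'user1']

# Elements of the row that are not in sys_list (what the question asks for).
kept = set(row).difference(sys_list)

# Reversed operands: elements of sys_list that are missing from the row.
reversed_diff = set(sys_list).difference(row)

print(kept)           # {'user1'}
print(reversed_diff)  # {'sys3'}
```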
You may use sets for better performance (this approach assumes that the order within the lists is not important, as it will change):
sys_set = set(['sys1', 'sys2', 'sys3'])
df['A'] = (df.A.map(set)-sys_set).map(list)
print(df)
A
0 [user1]
1 [user6, user1, user3]
2 []
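If the original order matters but set-based filtering is still wanted, one option is to sort each result by its position in the source list. A rough sketch, assuming the lists hold no duplicates that need to be kept (sets drop them); the helper name remove_sys is hypothetical:

```python
import pandas as pd

sys_set = {'sys1', 'sys2', 'sys3'}
df = pd.DataFrame({'A': [['sys1', 'sys2', 'user1'],
                         ['user3', 'user6', 'user1'],
                         ['sys1', 'sys2', 'sys3']]})

def remove_sys(row):
    # Set difference drops the unwanted items; sorting by row.index
    # puts the survivors back in their original order.
    return sorted(set(row) - sys_set, key=row.index)

df['A'] = df['A'].map(remove_sys)

print(df['A'].tolist())
```

Because row.index scans the list on every call, this is only a reasonable fit for short lists; for longer ones a plain comprehension with a set lookup keeps the order without the extra sort.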