Python Pandas: remove items in one column's list from the list in another column
I have a dataframe with two columns, df['col1'] and df['col2'], that both contain lists of strings. For each row, I am trying to remove all the items in col2's list from col1's list and store the result in a new column. For example:
col1            col2            col3
[a, b, c]       [a, b]          [c]
[a, c, f, d]    [a, f]          [c, d]
[d, c, e, f]    [d, e, f, c]    []
You can write your own function to build the new lists and then use apply on the dataframe to run that function on each row, like so:
import pandas as pd

df = pd.DataFrame({
    'col1': [['a', 'b', 'c'], ['a', 'c', 'f', 'd'], ['d', 'c', 'e', 'f']],
    'col2': [['a', 'b'], ['a', 'f'], ['d', 'e', 'f', 'c']],
})

def func(row):
    # Set difference: items of col1 that are not in col2 (order is not preserved)
    return list(set(row['col1']) - set(row['col2']))

df['col3'] = df.apply(func, axis=1)
The function converts the lists to sets and uses set subtraction to remove the values contained in col2 from col1. Note that sets are unordered and drop duplicates, so the items in col3 may come out in a different order than they had in col1.
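If the order of the remaining items matters (as in the expected output in the question), a list comprehension is one possible alternative to set subtraction. This is only a sketch under the same sample data; the helper name remove_items is made up for illustration and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [['a', 'b', 'c'], ['a', 'c', 'f', 'd'], ['d', 'c', 'e', 'f']],
    'col2': [['a', 'b'], ['a', 'f'], ['d', 'e', 'f', 'c']],
})

def remove_items(row):
    # Hypothetical helper: keep col1's items, in their original order,
    # that do not appear in col2.
    exclude = set(row['col2'])  # a set makes the membership test fast
    return [item for item in row['col1'] if item not in exclude]

df['col3'] = df.apply(remove_items, axis=1)
print(df['col3'].tolist())  # [['c'], ['c', 'd'], []]
```

Unlike the set-based version, this also keeps repeated items in col1 as long as they are not listed in col2.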