Compare 2 list columns in a pandas dataframe. Remove value from one list if present in another
Suppose I have 2 list columns, as follows:
import pandas as pd
group1 = [['John', 'Mark'], ['Ben', 'Johnny'], ['Sarah', 'Daniel']]
group2 = [['Aya', 'Boa'], ['Mab', 'Johnny'], ['Sarah', 'Peter']]
df = pd.DataFrame({'group1': group1, 'group2': group2})
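I want to compare the two list columns and remove elements from the lists in group1 if they are present in the group2 list of the same row. So the expected result for the data above is: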
group1 group2
['John', 'Mark'] ['Aya', 'Boa']
['Ben'] ['Mab', 'Johnny']
['Daniel'] ['Sarah', 'Peter']
How can I do this? I tried this:
df['group1'] = [[name for name in df['group1'] if name not in df['group2']]]
But I get an error:
TypeError: unhashable type: 'list'
Please help.
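A likely reason for the error, shown as a minimal sketch: the comprehension in the attempt iterates over the whole column, so each name is itself a list, and the `in` operator on a pandas Series tests the index labels rather than the values, which requires hashing the list. The Series `s` and the variable `message` are illustrative names, not part of the question.

```python
import pandas as pd

# A small Series of lists, similar to the group2 column.
s = pd.Series([['Aya', 'Boa'], ['Mab', 'Johnny']])

# `in` on a Series checks the index labels (0 and 1 here), not the values.
label_found = 0 in s
value_found = 'Aya' in s

# Testing a list for membership forces pandas to hash it, which fails.
message = None
try:
    ['Aya', 'Boa'] in s
except TypeError as exc:
    message = str(exc)

print(label_found, value_found, message)
```

The comparison therefore has to be done row by row, pairing each group1 list with the group2 list from the same row.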
You can use a set difference:
df.apply(lambda x: set(x['group1']).difference(x['group2']), axis=1)
Output:
0 {John, Mark}
1 {Ben}
2 {Daniel}
dtype: object
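To get lists instead of sets, you can append .apply(list) at the end. Note that sets are unordered, so the order of the names within each resulting list is not guaranteed.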
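Put together, the set-difference approach might look like the sketch below. Using `sorted` in place of `list` is an assumption added here to make the element order deterministic (alphabetical); it does not restore the original order, and duplicates within a list are dropped.

```python
import pandas as pd

group1 = [['John', 'Mark'], ['Ben', 'Johnny'], ['Sarah', 'Daniel']]
group2 = [['Aya', 'Boa'], ['Mab', 'Johnny'], ['Sarah', 'Peter']]
df = pd.DataFrame({'group1': group1, 'group2': group2})

# Row-wise set difference: names in group1 that are not in group2.
diff = df.apply(lambda x: set(x['group1']).difference(x['group2']), axis=1)

# Convert each set back to a list; sorted() returns a list in a stable order.
df['group1'] = diff.apply(sorted)

print(df)
```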
You can use a loop (a list comprehension) inside a lambda function:
df['group1'] = df[['group1', 'group2']].apply(lambda x: [i for i in x['group1'] if i not in x['group2']], axis=1)
print(df)
'''
group1 group2
0 [John, Mark] [Aya, Boa]
1 [Ben] [Mab, Johnny]
2 [Daniel] [Sarah, Peter]
'''
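As a variation on the same idea, the filtering can also be written without apply by zipping the two columns. This is only a sketch: the helper name `remove_shared` is hypothetical, and building a set per row is an optional choice that speeds up the membership test for long lists. It keeps the original order and any duplicates in group1.

```python
import pandas as pd

group1 = [['John', 'Mark'], ['Ben', 'Johnny'], ['Sarah', 'Daniel']]
group2 = [['Aya', 'Boa'], ['Mab', 'Johnny'], ['Sarah', 'Peter']]
df = pd.DataFrame({'group1': group1, 'group2': group2})


def remove_shared(names, others):
    """Return the names that do not appear in others, keeping their order."""
    exclude = set(others)  # set lookup instead of scanning the list each time
    return [name for name in names if name not in exclude]


# Pair each group1 list with the group2 list from the same row.
df['group1'] = [
    remove_shared(g1, g2) for g1, g2 in zip(df['group1'], df['group2'])
]

print(df)
```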