Checking if any string element of one column matches the string list in another column in Python
df:
CAR1 CAR2
['ford','hyundai'] ['ford','hyundai']
['ford','hyundai'] ['hyundai','nissan']
['ford','hyundai'] ['bmw', 'audi']
Expected output:
CAR1 CAR2 Flag
['ford','hyundai'] ['ford','hyundai'] 1
['ford','hyundai'] ['hyundai','nissan'] 1
['ford','hyundai'] ['bmw', 'audi'] 0
Set Flag to 1 if any element (string) of CAR1 also appears in CAR2; otherwise set Flag to 0.
My try is:
df[[x in y for x,y in zip(df['CAR1'], df['CAR2'])]
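A short sketch of why an `x in y` test between two lists does not give the wanted flag: `in` asks whether the whole left-hand list is itself an element of the right-hand list, not whether the two lists share an element. The variable names below are illustrative and not from the original post.

```python
# Two cells from one row, using the sample values from the question.
car1 = ['ford', 'hyundai']
car2 = ['hyundai', 'nissan']

# `car1 in car2` checks whether the list object ['ford', 'hyundai']
# is one of the elements of car2. car2 only contains strings.
whole_list_is_member = car1 in car2

# What the question actually asks: does any single element of car1
# appear in car2?
any_element_matches = any(item in car2 for item in car1)

print(whole_list_is_member)  # False
print(any_element_matches)   # True
```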
EDIT: if the columns contain string representations of lists, first convert them to real lists:
import ast
cols = ['CAR1','CAR2']
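# literal_eval expects a single string, so map it over each cell of each column
df[cols] = df[cols].apply(lambda col: col.map(ast.literal_eval))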
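To illustrate why the conversion matters, here is a minimal sketch on plain strings. It assumes the cells were read in as text such as `"['ford','hyundai']"` (for example from a CSV file); the variable names are made up for the example.

```python
import ast

# Hypothetical raw cell values, stored as text rather than as lists.
raw_car1 = "['ford','hyundai']"
raw_bmw = "['bmw', 'audi']"

# On raw strings, set() yields individual characters, so two cells
# "overlap" merely because both contain brackets, quotes and commas.
false_positive = bool(set(raw_car1).intersection(raw_bmw))

# After literal_eval the cells are real lists of strings.
car1 = ast.literal_eval(raw_car1)
bmw = ast.literal_eval(raw_bmw)
real_overlap = bool(set(car1).intersection(bmw))

print(car1)            # ['ford', 'hyundai']
print(false_positive)  # True
print(real_overlap)    # False
```

`ast.literal_eval` only parses Python literals, so it is a safer choice than `eval` for this kind of input.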
Use set.intersection in a list comprehension, converting the result to a boolean and then to an integer to map True/False to 1/0:
df['Flag'] = [int(bool(set(x).intersection(y))) for x,y in zip(df['CAR1'], df['CAR2'])]
Alternative solution:
df['Flag'] = [1 if set(x).intersection(y) else 0 for x,y in zip(df['CAR1'], df['CAR2'])]
print (df)
CAR1 CAR2 Flag
0 [ford, hyundai] [ford, hyundai] 1
1 [ford, hyundai] [hyundai, nissan] 1
2 [ford, hyundai] [bmw, audi] 0
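For reference, a self-contained sketch of the approach above that rebuilds the sample frame from the question. The extra `Common` column is not part of the original answer; it is added here only to make the intermediate intersection visible.

```python
import pandas as pd

# Sample data from the question, with real lists in each cell.
df = pd.DataFrame({
    'CAR1': [['ford', 'hyundai'], ['ford', 'hyundai'], ['ford', 'hyundai']],
    'CAR2': [['ford', 'hyundai'], ['hyundai', 'nissan'], ['bmw', 'audi']],
})

# Illustrative extra column: the shared elements of each row, sorted
# so the result does not depend on set ordering.
df['Common'] = [sorted(set(x).intersection(y))
                for x, y in zip(df['CAR1'], df['CAR2'])]

# An empty intersection is falsy, so bool() gives False and int() gives 0.
df['Flag'] = [int(bool(set(x).intersection(y)))
              for x, y in zip(df['CAR1'], df['CAR2'])]

print(df['Flag'].tolist())    # [1, 1, 0]
print(df['Common'].tolist())  # [['ford', 'hyundai'], ['hyundai'], []]
```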
You can use set operations in a list comprehension (isdisjoint returns False if the sets overlap; the result is inverted and converted to an integer with 1-x):
df['Flag'] = [1-set(s1).isdisjoint(s2) for s1, s2 in zip(df['CAR1'], df['CAR2'])]
NB. isdisjoint is quite fast because it does not need to read the full sets: it returns False as soon as a common item is found.
Output:
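A small sketch of both points, using plain lists instead of a DataFrame. The `tracked` generator is a hypothetical helper written for this example to record how many items `isdisjoint` actually consumes; the early exit it shows is how CPython behaves and is assumed here rather than guaranteed by the language.

```python
pairs = [
    (['ford', 'hyundai'], ['ford', 'hyundai']),
    (['ford', 'hyundai'], ['hyundai', 'nissan']),
    (['ford', 'hyundai'], ['bmw', 'audi']),
]

# bool is a subclass of int, so 1 - True == 0 and 1 - False == 1.
flags_disjoint = [1 - set(a).isdisjoint(b) for a, b in pairs]
flags_intersect = [int(bool(set(a).intersection(b))) for a, b in pairs]
print(flags_disjoint)  # [1, 1, 0]

consumed = []

def tracked(items):
    """Yield items one by one, recording each item that is requested."""
    for item in items:
        consumed.append(item)
        yield item

# The first item is already a match, so the remaining ones are not read.
result = set(['ford', 'hyundai']).isdisjoint(tracked(['ford', 'bmw', 'audi']))
print(result)    # False
print(consumed)  # ['ford']
```

Both comprehensions give the same flags; the difference is that `intersection` always builds the full result set, while `isdisjoint` can stop at the first match.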
CAR1 CAR2 Flag
0 [ford, hyundai] [ford, hyundai] 1
1 [ford, hyundai] [hyundai, nissan] 1
2 [ford, hyundai] [bmw, audi] 0
# If the columns hold strings rather than lists, parse each cell on the fly:
from ast import literal_eval
df['Flag'] = [1-set(s1).isdisjoint(s2) for s1, s2 in
zip(df['CAR1'].apply(literal_eval),
df['CAR2'].apply(literal_eval))]