Check if any string element of a column matches another column's list of strings in Python

Question

df

     CAR1                        CAR2
['ford','hyundai']         ['ford','hyundai']
['ford','hyundai']         ['hyundai','nissan']
['ford','hyundai']         ['bmw', 'audi']

Expected output:

 CAR1                        CAR2                   Flag
['ford','hyundai']         ['ford','hyundai']        1
['ford','hyundai']         ['hyundai','nissan']      1
['ford','hyundai']         ['bmw', 'audi']           0

Set Flag to 1 if any element (string) of CAR1 matches an element of CAR2; otherwise set Flag to 0.

My attempt:

df[[x in y for x,y in zip(df['CAR1'], df['CAR2'])]

Answer 1

EDIT: If the columns hold string representations of lists, first convert them to real lists:

import ast

cols = ['CAR1','CAR2']
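# literal_eval parses one string at a time, so map it over each column's values
df[cols] = df[cols].apply(lambda col: col.map(ast.literal_eval))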

Use set.intersection in a list comprehension, converting the result to a boolean and then to an integer to map True/False to 1/0:

df['Flag'] = [int(bool(set(x).intersection(y))) for x,y in zip(df['CAR1'], df['CAR2'])]

Alternative solution:

df['Flag'] = [1 if set(x).intersection(y) else 0 for x,y in zip(df['CAR1'], df['CAR2'])]

print (df)
              CAR1               CAR2  Flag
0  [ford, hyundai]    [ford, hyundai]     1
1  [ford, hyundai]  [hyundai, nissan]     1
2  [ford, hyundai]        [bmw, audi]     0
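
As a minimal sketch of the same idea outside pandas, the parsing step and the intersection test can be wrapped in one function. The name overlap_flag is hypothetical (it is not part of the answer above), and the sketch assumes each value is either a list or a string representation of a list:

```python
import ast

def overlap_flag(left, right):
    """Return 1 if the two list-like values share an item, else 0."""
    # Hypothetical helper: parse string representations such as "['ford','hyundai']"
    if isinstance(left, str):
        left = ast.literal_eval(left)
    if isinstance(right, str):
        right = ast.literal_eval(right)
    # A non-empty intersection is truthy; int() maps True/False to 1/0
    return int(bool(set(left).intersection(right)))

# The three rows from the question, written as strings
rows = [
    ("['ford','hyundai']", "['ford','hyundai']"),
    ("['ford','hyundai']", "['hyundai','nissan']"),
    ("['ford','hyundai']", "['bmw', 'audi']"),
]
flags = [overlap_flag(a, b) for a, b in rows]
print(flags)  # [1, 1, 0]
```

With a DataFrame, the same function could be used as df['Flag'] = [overlap_flag(x, y) for x, y in zip(df['CAR1'], df['CAR2'])]. Note that the comparison is exact, so 'Ford' and 'ford' would not count as a match.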

Answer 2

You can use set operations in a list comprehension (isdisjoint returns False if the sets overlap; 1-x inverts that result and converts it to an integer):

df['Flag'] = [1-set(s1).isdisjoint(s2) for s1, s2 in zip(df['CAR1'], df['CAR2'])]

NB. isdisjoint is quite fast because it doesn't need to read the full sets: it returns False as soon as a common item is found.

Output:

              CAR1               CAR2  Flag
0  [ford, hyundai]    [ford, hyundai]     1
1  [ford, hyundai]  [hyundai, nissan]     1
2  [ford, hyundai]        [bmw, audi]     0
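
To see why the 1-x trick works, here is a small sketch on plain Python lists (the variable names are illustrative only). It relies on two facts: isdisjoint accepts any iterable as its argument, and bool is a subclass of int, so subtracting a boolean from 1 yields an integer:

```python
car1 = ['ford', 'hyundai']

# Only the left side has to be a set; the argument can be any iterable
print(set(car1).isdisjoint(['hyundai', 'nissan']))  # False: 'hyundai' is shared
print(set(car1).isdisjoint(['bmw', 'audi']))        # True: nothing in common

# 1 - False == 1 and 1 - True == 0, so overlap becomes 1 and no overlap becomes 0
others = (['hyundai', 'nissan'], ['bmw', 'audi'])
flags = [1 - set(car1).isdisjoint(other) for other in others]
print(flags)  # [1, 0]
```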

From strings (when the columns hold string representations of lists):

from ast import literal_eval

df['Flag'] = [1-set(s1).isdisjoint(s2) for s1, s2 in
               zip(df['CAR1'].apply(literal_eval),
                   df['CAR2'].apply(literal_eval))]
