
Checking if any string element of one column matches the string list in another column in Python

df:

     CAR1                        CAR2
['ford','hyundai']         ['ford','hyundai']
['ford','hyundai']         ['hyundai','nissan']
['ford','hyundai']         ['bmw', 'audi']

Expected output:

 CAR1                        CAR2                   Flag
['ford','hyundai']         ['ford','hyundai']        1
['ford','hyundai']         ['hyundai','nissan']      1
['ford','hyundai']         ['bmw', 'audi']           0 

Raise flag 1 if any element/string from CAR1 matches one in CAR2; otherwise raise flag 0.

My try is:

df[[x in y for x, y in zip(df['CAR1'], df['CAR2'])]]

EDIT: if the columns hold strings rather than lists, first convert them to lists:

import ast

cols = ['CAR1','CAR2']
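# literal_eval parses one string at a time, so map it over each column's cells
df[cols] = df[cols].apply(lambda col: col.map(ast.literal_eval))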
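Whether this conversion is needed can be checked by looking at the type of a single cell. The following is a minimal sketch, assuming the columns were loaded as strings (for example from a CSV file); the two-row sample frame is made up for illustration, and `literal_eval` is mapped over each column element by element.

```python
import ast
import pandas as pd

# Hypothetical sample: each cell is a *string* that only looks like a list
df = pd.DataFrame({
    'CAR1': ["['ford','hyundai']", "['ford','hyundai']"],
    'CAR2': ["['ford','hyundai']", "['bmw', 'audi']"],
})

cols = ['CAR1', 'CAR2']
print(type(df.loc[0, 'CAR1']))   # <class 'str'>

# Series.map applies literal_eval to every cell of the column
df[cols] = df[cols].apply(lambda col: col.map(ast.literal_eval))
print(type(df.loc[0, 'CAR1']))   # <class 'list'>
```

If the first `print` already shows `list`, the conversion step can be skipped.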

Use set.intersection in a list comprehension, converting the result to a boolean and then to an integer to map True/False to 1/0:

df['Flag'] = [int(bool(set(x).intersection(y))) for x,y in zip(df['CAR1'], df['CAR2'])]

Alternative solution:

df['Flag'] = [1 if set(x).intersection(y) else 0 for x,y in zip(df['CAR1'], df['CAR2'])]

print (df)
              CAR1               CAR2  Flag
0  [ford, hyundai]    [ford, hyundai]     1
1  [ford, hyundai]  [hyundai, nissan]     1
2  [ford, hyundai]        [bmw, audi]     0
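For anyone who wants to reproduce this from scratch, here is a self-contained sketch that rebuilds the sample frame from the question and applies the intersection test. The helper name `has_common` is made up for illustration. It also shows why `x in y` from the question's attempt does not work: it asks whether the whole list `x` is itself an element of `y`.

```python
import pandas as pd

# Sample data from the question, with real lists in each cell
df = pd.DataFrame({
    'CAR1': [['ford', 'hyundai']] * 3,
    'CAR2': [['ford', 'hyundai'], ['hyundai', 'nissan'], ['bmw', 'audi']],
})

# `x in y` tests list membership, not overlap, so it is False even here
print(['ford', 'hyundai'] in ['ford', 'hyundai'])   # False

def has_common(a, b):
    """Return 1 if a and b share at least one element, else 0."""
    return int(bool(set(a).intersection(b)))

df['Flag'] = [has_common(x, y) for x, y in zip(df['CAR1'], df['CAR2'])]
print(df['Flag'].tolist())   # [1, 1, 0]
```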

You can use set operations in a list comprehension (isdisjoint returns False if the sets overlap; 1-x inverts that result and converts it to an integer):

df['Flag'] = [1-set(s1).isdisjoint(s2) for s1, s2 in zip(df['CAR1'], df['CAR2'])]

NB. isdisjoint is quite fast because it doesn't need to read the full sets; it returns False as soon as a common item is found.

Output:

              CAR1               CAR2  Flag
0  [ford, hyundai]    [ford, hyundai]     1
1  [ford, hyundai]  [hyundai, nissan]     1
2  [ford, hyundai]        [bmw, audi]     0
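The `1 - ...` trick may look odd at first, so here is a small plain-Python sketch of what happens for a single pair of lists. The variable names are illustrative only.

```python
a = ['ford', 'hyundai']
b = ['hyundai', 'nissan']
c = ['bmw', 'audi']

# isdisjoint accepts any iterable, so only the first list is turned into a set
print(set(a).isdisjoint(b))   # False: 'hyundai' is shared
print(set(a).isdisjoint(c))   # True: nothing in common

# bool is a subclass of int, so 1 - False == 1 and 1 - True == 0
flags = [1 - set(a).isdisjoint(other) for other in (b, c)]
print(flags)                  # [1, 0]

# The intersection-based test gives the same flags
assert flags == [int(bool(set(a) & set(other))) for other in (b, c)]
```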

From strings:

from ast import literal_eval

df['Flag'] = [1-set(s1).isdisjoint(s2) for s1, s2 in
               zip(df['CAR1'].apply(literal_eval),
                   df['CAR2'].apply(literal_eval))]
