pandas - Put value in new column if partial string match exists

Question

I've got a tricky problem in pandas to solve. I was previously referred to this thread as a solution, but it is not what I am looking for.

Take this example dataframe with two columns:

import pandas as pd

df = pd.DataFrame([['Mexico', 'Chile'], ['Nicaragua', 'Nica'], ['Colombia', 'Mex']], columns = ["col1", "col2"])

I first want to check each row in column 2 to see if that value exists anywhere in column 1. The check should cover both full and partial string matches.

df['compare'] = df['col2'].apply(lambda x: 'Yes' if df['col1'].str.contains(x).any() else 'No')

This tells me whether there is a partial or full string match, which is good but not quite what I need. Here is what the dataframe looks like now:
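A minimal, self-contained sketch of the check above and the output it should produce. One assumption that is not in the original snippet: `regex=False` is passed so that each col2 value is treated as a literal substring rather than a regular expression.

```python
import pandas as pd

df = pd.DataFrame(
    [["Mexico", "Chile"], ["Nicaragua", "Nica"], ["Colombia", "Mex"]],
    columns=["col1", "col2"],
)

# regex=False is an addition here: it makes str.contains do a plain
# substring test instead of interpreting the col2 value as a regex.
df["compare"] = df["col2"].apply(
    lambda x: "Yes" if df["col1"].str.contains(x, regex=False).any() else "No"
)

print(df)
#         col1   col2 compare
# 0     Mexico  Chile      No
# 1  Nicaragua   Nica     Yes
# 2   Colombia    Mex     Yes
```

The column only records that a match exists somewhere in col1, not which value matched.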

What I really want is the value from column 1 that the value in column 2 matched. I have not been able to figure out how to associate them.

My desired result looks like this:

Answer 1

Here's a "pandas-less" way to do it. Probably not very efficient, but it gets the job done:

import pandas as pd

def compare_cols(match_col, partial_col):
    # For each string in partial_col, find the first string in match_col that contains it.
    series = []
    for partial_str in partial_col:
        for match_str in match_col:
            if partial_str in match_str:
                series.append(match_str)
                break  # matches to the first value found in match_col
        else:  # for loop did not break = no match found
            series.append(None)
    return series

df = pd.DataFrame([['Mexico', 'Chile'], ['Nicaragua', 'Nica'], ['Colombia', 'Mex']], columns = ["col1", "col2"])

df['compare'] = compare_cols(match_col=df.col1, partial_col=df.col2)
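For reference, the same first-match logic can be sketched more compactly with `next()` and a generator expression. The helper name `first_match` is made up for this illustration and is not part of the answer; the result is built as a plain list so the printed values are easy to check.

```python
import pandas as pd

df = pd.DataFrame(
    [["Mexico", "Chile"], ["Nicaragua", "Nica"], ["Colombia", "Mex"]],
    columns=["col1", "col2"],
)

def first_match(partial_str, candidates):
    """Return the first candidate that contains partial_str, or None if none does."""
    return next((c for c in candidates if partial_str in c), None)

candidates = df["col1"].tolist()
matches = [first_match(p, candidates) for p in df["col2"]]
df["compare"] = matches

print(matches)  # [None, 'Nicaragua', 'Mexico']
```

"Chile" has no match in col1, so its entry is None; "Nica" and "Mex" map to "Nicaragua" and "Mexico".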

Note that if a string in col2 matches more than one string in col1, the first occurrence is used.
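If every match is wanted rather than only the first, one possible variation is to collect them in a list per row. This is a sketch, not part of the answer above: the function name `all_matches` and the sample data (chosen so that "Mex" matches two values) are made up for illustration.

```python
import pandas as pd

# Made-up data in which "Mex" is contained in two different col1 values.
df = pd.DataFrame(
    [["Mexico", "Mex"], ["New Mexico", "Chi"], ["Chile", "Peru"]],
    columns=["col1", "col2"],
)

def all_matches(match_col, partial_col):
    """For each partial string, list every value of match_col that contains it."""
    return [[m for m in match_col if p in m] for p in partial_col]

# Wrap the ragged list of lists in a Series so each row holds one list.
df["matches"] = pd.Series(
    all_matches(match_col=df["col1"], partial_col=df["col2"]), index=df.index
)

print(df["matches"].tolist())  # [['Mexico', 'New Mexico'], ['Chile'], []]
```

An empty list plays the role that None plays in the first-match version.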

Answer 1 (2 votes, accepted, 2021-02-25 14:11:24)
