How to compare two dataframe columns and extract the third column's value as output in Python

Question

I have two dataframes, as shown below.

df1:

subject
laptop issue
password reset
account unlock
...

df2:

key                       Automation    
reset password              70%
lock account                50%
unlock                      70%
...

I want to take the first row from df1 and check it against the "key" column in df2, that is, check whether all the words from that row appear in df2's "key" column. For example, "laptop issue" is not in df2, whereas "password reset" is.

When "password reset" matches an entry in df2, the corresponding "Automation" value should be written to a separate column in df1.

So df1 will look like:

df1:

subject                    Automation
laptop issue                   0%
password reset                 70%
account unlock                 0%

How should I go about this in Python?

The words in a sentence can be in any order; it does not have to be exactly the same sentence.

Answer 1

The code below handles exact matches regardless of word order by creating sorted keys to match against the subjects:

import pandas as pd

df = pd.DataFrame({"subject":["laptop issue","password reset","account unlock"]})
df2 = pd.DataFrame({"key":["lock account","reset password","unlock"],
                    "Automation":["50%","70%","70%"]})

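# Build an order-independent key by sorting the words in each phrase
df["new"] = df["subject"].apply(lambda x: " ".join(sorted(x.split())))
df2["new"] = df2["key"].apply(lambda x: " ".join(sorted(x.split())))

# Left-merge on the sorted key; subjects with no match get "0%"
print(df.merge(df2, on="new", how="left").fillna("0%").drop(columns=["new", "key"]))

# Output: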
          subject Automation
0    laptop issue         0%
1  password reset        70%
2  account unlock         0%
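
The sorted-key merge only matches when a subject and a key contain exactly the same words. If the intent is the looser rule from the question (every word of the subject appears somewhere in a key, which may have extra words), a set-based lookup may be closer. The following is a minimal sketch under that assumption; `lookup_automation` and `key_words` are hypothetical names introduced here, and the first matching key wins, which is also an assumption.

```python
import pandas as pd

df1 = pd.DataFrame({"subject": ["laptop issue", "password reset", "account unlock"]})
df2 = pd.DataFrame({"key": ["reset password", "lock account", "unlock"],
                    "Automation": ["70%", "50%", "70%"]})

# Pre-compute the set of words in every key once
key_words = [set(k.split()) for k in df2["key"]]

def lookup_automation(subject, default="0%"):
    """Hypothetical helper: return the Automation value of the first key
    that contains every word of `subject`, or `default` if none does."""
    words = set(subject.split())
    for key_set, value in zip(key_words, df2["Automation"]):
        if words <= key_set:  # subset test: all subject words are in this key
            return value
    return default

df1["Automation"] = df1["subject"].apply(lookup_automation)
print(df1)
```

With the sample data this gives the same three values as the merge-based approach. The two differ only when a key has more words than the subject: a subject of just "reset" would pick up 70% here but would not match under the sorted-key merge.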

Answer 1, posted 2019-10-21 02:26:22
