How to compare two dataframe columns and extract a third column's value as output in Python
I have two dataframes as shown below:
df1:
subject
laptop issue
password reset
account unlock
...
df2:
key             Automation
reset password  70%
lock account    50%
unlock          70%
...
I want to take the first row from df1 and check it against the "key" column in df2, that is, check whether all the words from that row appear in an entry of the "key" column. For example, "laptop issue" is not in df2, whereas "password reset" appears in df2.
When "password reset" matches an entry in df2, its corresponding "Automation" value should be written to a separate column in df1.
So df1 will look like:
df1:
subject         Automation
laptop issue    0%
password reset  70%
account unlock  0%
How should I go about this in Python?
The words in a sentence can be in any order; the key does not need to be the identical sentence.
The following works for exact matches, whether or not the words are in the same order, by creating sorted keys to match against the subjects:
import pandas as pd

df = pd.DataFrame({"subject": ["laptop issue", "password reset", "account unlock"]})
df2 = pd.DataFrame({"key": ["lock account", "reset password", "unlock"],
                    "Automation": ["50%", "70%", "70%"]})

# Build an order-independent key: the words of each phrase, sorted and rejoined.
df["new"] = df["subject"].apply(lambda x: " ".join(sorted(x.split())))
df2["new"] = df2["key"].apply(lambda x: " ".join(sorted(x.split())))

# Left-merge on the sorted key; subjects without a match get "0%".
print(df.merge(df2, on="new", how="left").fillna("0%").drop(columns=["new", "key"]))
# Output:
          subject Automation
0    laptop issue         0%
1  password reset        70%
2  account unlock         0%
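The sorted-key merge only matches when a subject and a key contain exactly the same words. If a key may contain extra words (for instance a key such as "reset my password", which is a made-up example and not in the question's data), one possible variation is to compare word sets instead. The sketch below assumes that a subject matches when all of its words appear in a key, that matching should be case-insensitive, and that the first matching key wins; the helper name `lookup_automation` is hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({"subject": ["laptop issue", "password reset", "account unlock"]})
df2 = pd.DataFrame({"key": ["lock account", "reset password", "unlock"],
                    "Automation": ["50%", "70%", "70%"]})

# Pre-compute (word set, Automation value) pairs for every key, in df2 order.
key_sets = [(set(key.lower().split()), automation)
            for key, automation in zip(df2["key"], df2["Automation"])]

def lookup_automation(subject, key_sets, default="0%"):
    """Return the Automation value of the first key containing every subject word."""
    words = set(subject.lower().split())
    for key_words, automation in key_sets:
        if words <= key_words:  # subset test: all subject words occur in the key
            return automation
    return default

df1["Automation"] = df1["subject"].apply(lookup_automation, args=(key_sets,))
print(df1)
```

On the question's data this gives the same values as the merge: "account unlock" stays at 0% because neither "lock account" nor "unlock" contains both of its words. The loop is O(rows × keys), so for large frames the sorted-key merge is likely the faster choice when exact matches are enough.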