Compare two DataFrame columns of sentences and return a third

Question

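I want to compare the sentences in two long DataFrame columns and return a third column that looks like the following. A snapshot is shown below.

My first approach was lengthy and worked only for a single instance; it failed when I applied it to the DataFrame. It can be found in my previous question.

The logic: every word of c1 that also appears in c2 gets the value 1, and every word that appears only in c1 gets the value 0.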


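import pandas as pd

# One example row; c3 is the expected result, one flag per word of Sent1 (12 words)
tra_df = pd.DataFrame({
    'Sent1': ["I am completely happy with the plan you have laid out today"],
    'Sent2': ['the plan you have laid out today'],
})
sentences = tra_df['Sent1']
context = tra_df['Sent2']
c3 = ['0', '0', '0', '0', '0', '1', '1', '1', '1', '1', '1', '1']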

Answer 1

Based on my understanding of your question, here is a solution.

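def get_common_words(c1, c2):
    # Flag each word of c1: 1 if it also occurs in c2, otherwise 0
    c2_words = set(c2.split())  # split c2 once; a set gives fast membership tests
    return [1 if word in c2_words else 0 for word in c1.split()]

c1 = "I am completely happy with the plan you have laid out today"
c2 = 'the plan you have laid out today'
get_common_words(c1, c2)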
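
Note that the comparison above is an exact string match on whitespace-separated tokens, so "The" and "the", or "today" and "today.", count as different words. If that matters for your data, a variant along the following lines might help. This is only a sketch: the name get_common_words_normalized and the choice to lowercase and trim edge punctuation are my assumptions, not something the question asks for.

```python
import string

def get_common_words_normalized(c1, c2):
    """Hypothetical variant: flag words of c1 found in c2, ignoring case and edge punctuation."""
    def normalize(word):
        # Lowercase and trim punctuation from both ends, e.g. "Today." -> "today"
        return word.lower().strip(string.punctuation)

    c2_words = {normalize(w) for w in c2.split()}
    return [1 if normalize(w) in c2_words else 0 for w in c1.split()]

flags = get_common_words_normalized(
    "I am completely happy with the plan you have laid out today.",
    "The plan you have laid out today",
)
print(flags)
```

Both versions test membership only, so a word of c1 is flagged wherever it occurs as long as it appears anywhere in c2.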

If you want to apply it to a pandas DataFrame:

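def get_common_words_df(row):
    # Row-wise wrapper: compare the two sentence columns of a single row
    c1 = row['Sent1']
    c2 = row['Sent2']
    return get_common_words(c1, c2)


# tra_df is the DataFrame from the question; axis=1 passes one row at a time
tra_df['sent3'] = tra_df.apply(get_common_words_df, axis=1)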

There is plenty of room to optimize this.
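
One possible optimization, as a rough sketch: iterate over the two columns in parallel with zip instead of calling apply row by row, and build each context's word set once per row. The helper name flag_common_words and the second sample row are made up for illustration; plain lists stand in for the DataFrame columns so the snippet runs on its own.

```python
def flag_common_words(sentences, contexts):
    """For each (sentence, context) pair, return one 0/1 flag per word of the sentence."""
    result = []
    for sentence, context in zip(sentences, contexts):
        context_words = set(context.split())  # built once per row
        result.append([int(word in context_words) for word in sentence.split()])
    return result

# Plain lists stand in for the Sent1 and Sent2 columns
sent1 = ["I am completely happy with the plan you have laid out today", "good morning"]
sent2 = ["the plan you have laid out today", "morning"]
rows = flag_common_words(sent1, sent2)
print(rows[0])
print(rows[1])

# With a DataFrame, the same call should accept the columns directly, e.g.
# tra_df['sent3'] = flag_common_words(tra_df['Sent1'], tra_df['Sent2'])
```

Whether this is noticeably faster than apply depends on the size of the data, so it is worth timing on your own DataFrame.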

Solution 1
1 accepted 2020-04-20 16:20:53
