How do I replace strings in a sentence with the appropriate dictionary values?

Question

I have a dictionary as follows:

dict_ = { 
        'the school in USA' : 'some_text_1',
        'school' : 'some_text_1',
        'the holy church in brisbane' : 'some_text_2',
        'holy church' : 'some_text_2'
}

and a list of sentences as follows:

text_sent = ["Ram is going to the holy church in brisbane",\
             "John is going to holy church", \
             "shena is going to the school in USA", \
             "Jennifer is going to the school"]

I want to replace every occurrence of a dict_ key in text_sent with its corresponding value. This is how I do it:

for ind, text in enumerate(text_sent) :
    for iterator in dict_.keys() :
        if iterator in text : 
            text_sent[ind] = re.sub(iterator, dict_[iterator], text)

for i in text_sent:
    print(i)

The output I get is:

Ram is going to the some_text_2 in brisbane
John is going to some_text_2
shena is going to the some_text_1 in USA
Jennifer is going to the some_text_1

The expected output is:

Ram is going to some_text_2
John is going to some_text_2
shena is going to some_text_1
Jennifer is going to some_text_1

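What I need is for the longer string (e.g. 'the holy church in brisbane') to be replaced whenever it appears in full. Only if the full string is not in the sentence and just the shorter version (e.g. 'holy church') is present should the shorter key be used when substituting values into the text_sent sentences.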

Answer 1

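You can use re.sub for the substitution, with str.join building the regular expression from the dictionary keys. Alternatives are tried in order, so the longer keys must come before the shorter ones, as they do in this dictionary: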

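import re

d = {'the school in USA': 'some_text_1', 'school': 'some_text_1', 'the holy church in brisbane': 'some_text_2', 'holy church': 'some_text_2'}
text_sent = ["Ram is going to the holy church in brisbane",
             "John is going to holy church",
             "shena is going to the School in USA",
             "Jennifer is going to the school"]

# Lowercased keys, so that a case-insensitive match can still be looked up
lookup = {k.lower(): v for k, v in d.items()}
# flags must be passed by keyword: the fourth positional argument of re.sub is count
r = [re.sub('|'.join(map(re.escape, d)), lambda x: lookup[x.group().lower()], i, flags=re.I) for i in text_sent]
print(r)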

Output:

['Ram is going to some_text_2', 'John is going to some_text_2', 'shena is going to some_text_1', 'Jennifer is going to the some_text_1']
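The same idea can be wrapped in a small helper that does not depend on the order of the dictionary. This is only a sketch: the function name replace_phrases and the use of \b word boundaries are assumptions added for illustration, not part of the answer above. Keys are sorted longest first and escaped before being joined into the pattern.

```python
import re

def replace_phrases(sentences, mapping):
    """Replace every mapping key found in each sentence, preferring longer keys."""
    # Longest keys first: alternation takes the first alternative that matches.
    keys = sorted(mapping, key=len, reverse=True)
    # \b stops "school" from matching inside a longer word such as "preschool".
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b")
    return [pattern.sub(lambda m: mapping[m.group()], s) for s in sentences]

d = {'school': 'some_text_1',
     'the school in USA': 'some_text_1',
     'holy church': 'some_text_2',
     'the holy church in brisbane': 'some_text_2'}

print(replace_phrases(["Ram is going to the holy church in brisbane",
                       "She teaches preschool"], d))
# ['Ram is going to some_text_2', 'She teaches preschool']
```

This version is case-sensitive. Case-insensitive matching would also need a lowercased lookup table, as in the answer above.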

Answer 2

You can create a helper list of the dict's keys and sort it by the length of its elements, longest first.

dict_ = {'the school in USA' : 'some_text_1',
         'school' : 'some_text_1',
         'the holy church in brisbane' : 'some_text_2',
         'holy church' : 'some_text_2'}

text_sent = ["Ram is going to the holy church in brisbane",
             "John is going to holy church",
             "shena is going to the school in USA",
             "Jennifer is going to the school"]

# Longest keys first, so a full phrase is replaced before any shorter key inside it
dict_keys = sorted(dict_, key=len, reverse=True)

text_sent_replaced = []
for text in text_sent:
    modified_text = text
    # Iterate over the sorted keys, not over dict_ itself
    for key in dict_keys:
        modified_text = modified_text.replace(key, dict_[key])
    text_sent_replaced.append(modified_text)

print(text_sent_replaced)
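To see why the sort order matters, here is a minimal comparison of the two orderings on one sentence. The helper name replace_in_order is hypothetical and only serves this illustration.

```python
dict_ = {'the school in USA': 'some_text_1',
         'school': 'some_text_1',
         'the holy church in brisbane': 'some_text_2',
         'holy church': 'some_text_2'}

def replace_in_order(text, mapping, keys):
    """Apply str.replace once per key, in the given key order."""
    for key in keys:
        text = text.replace(key, mapping[key])
    return text

sentence = "Ram is going to the holy church in brisbane"
shortest_first = sorted(dict_, key=len)
longest_first = sorted(dict_, key=len, reverse=True)

# The short key fires first and breaks up the long phrase.
print(replace_in_order(sentence, dict_, shortest_first))
# Ram is going to the some_text_2 in brisbane

# The long key consumes the whole phrase before the short key is tried.
print(replace_in_order(sentence, dict_, longest_first))
# Ram is going to some_text_2
```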

Answer 3

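The main problem is that you haven't added a break statement. If more than one key in dict_ matches, a later match overwrites the value written by an earlier one. Try this: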

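import re

for ind, text in enumerate(text_sent):
    for iterator in dict_.keys():
        if iterator in text:
            text_sent[ind] = re.sub(iterator, dict_[iterator], text)
            # Stop at the first matching key; this relies on longer keys
            # being listed before shorter ones in dict_
            break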
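The break only gives the intended result when the longer keys happen to come first in dict_. A possible variant, sketched here with a hypothetical helper named replace_longest_match, chooses the longest matching key explicitly, so the dictionary order no longer matters. It uses str.replace instead of re.sub because the keys are plain strings.

```python
def replace_longest_match(text, mapping):
    """Replace only the longest key that occurs in text."""
    matches = [key for key in mapping if key in text]
    if not matches:
        return text
    longest = max(matches, key=len)
    return text.replace(longest, mapping[longest])

# Shorter keys are deliberately listed first here.
dict_ = {'school': 'some_text_1',
         'holy church': 'some_text_2',
         'the school in USA': 'some_text_1',
         'the holy church in brisbane': 'some_text_2'}

text_sent = ["Ram is going to the holy church in brisbane",
             "John is going to holy church",
             "shena is going to the school in USA",
             "Jennifer is going to the school"]

result = [replace_longest_match(text, dict_) for text in text_sent]
for line in result:
    print(line)
# Ram is going to some_text_2
# John is going to some_text_2
# shena is going to some_text_1
# Jennifer is going to the some_text_1
```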

Answer 4

This does the job without using re, as long as the element being replaced is at the end of each line, as is the case in your example:

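for ind, text in enumerate(text_sent):
    for iterator in dict_.keys():
        if iterator in text:
            # Keep everything before the key and append the replacement
            text_sent[ind] = text.split(iterator)[0] + dict_[iterator]
            # Stop at the first match so a shorter key cannot overwrite it
            # (relies on longer keys being listed first in dict_)
            break

for i in text_sent:
    print(i)

# Prints:
# Ram is going to some_text_2
# John is going to some_text_2
# shena is going to some_text_1
# Jennifer is going to the some_text_1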

4 answers

Answer 1: score 3, accepted, 2021-02-01 04:12:14
Answer 2: score 1, 2021-02-01 04:26:54
Answer 3: score 0, 2021-02-01 04:19:54
Answer 4: score 0, 2021-02-01 04:26:42
