Replace placeholders with dictionary keys/values

Question

I have text with placeholders, for example:

sometext $plc_hldr1 some more text $plc_hldr2 some more more text $1234date_placeholder some text $5678date_placeholder

I also have a dictionary whose keys are the placeholders and whose values are what each placeholder should be replaced with:

placeholders = {'$plc_hldr1': '1111',
                '$plc_hldr2': 'abcd'}

I found and adapted a function to handle the replacement:

import re

def multiple_replace(adict, text):
    # Create a regular expression from all of the dictionary keys
    regex = re.compile("|".join(map(re.escape, adict.keys())))

    # For each match, look up the corresponding value in the dictionary
    return regex.sub(lambda match: adict[match.group(0)], text)

The function works for $plc_hldr1 and $plc_hldr2.

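But there are also $1234date_placeholder and $5678date_placeholder, and both should be replaced with one predefined value. Here the date_placeholder part stays the same, but the numeric part is always different.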

This is what I came up with:

def multiple_replace(adict, text):
    # Create a regular expression from all of the dictionary keys
    regex = re.compile("|".join(map(re.escape, adict.keys())))
    # Replace the date placeholders, whatever their numeric part, with a fixed value
    text = re.sub(r"\$\d*date_placeholder", "20200101", text)
    # For each match, look up the corresponding value in the dictionary
    return regex.sub(lambda match: adict[match.group(0)], text)

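Is there a more elegant way to do this? And what if I have more placeholders with a variable numeric part that should each be replaced with the same value (e.g. $1234dname_placeholder, $1234age_placeholder)?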

Answer 1

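If you don't need to escape the rest of your placeholders, you can combine \$\d*date_placeholder with the rest of them. Then create a second dictionary, free of special regex characters, to look up the replacement for whatever the regex matched.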

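map(re.escape, adict.keys()) is necessary in your code because your placeholder names contain the special regex character $. I suggest you escape the special characters yourself and add your \$\d*date_placeholder lookup as a key/value pair in placeholders. This removes the need to re.escape the keys, and the need for a second substitution inside the multiple_replace function.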
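To make the role of the escaping concrete, here is a minimal check. The short `text` sample is only an illustration. An unescaped `$` is an end-of-string anchor, so the raw placeholder can never match, while the `re.escape` version and the hand-escaped raw string both do.

```python
import re

text = "sometext $plc_hldr1 some more text"

# Unescaped: "$" anchors at the end of the string, so nothing can follow it.
unescaped_match = re.search("$plc_hldr1", text)
print(unescaped_match)

# Escaped by re.escape: "$" is now matched literally.
escaped = re.escape("$plc_hldr1")
print(re.search(escaped, text).group(0))

# Escaping by hand in a raw string matches the same placeholder.
print(re.search(r"\$plc_hldr1", text).group(0))
```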

With the escaping done by hand, the code looks like this:

import re

placeholders = {r'\$plc_hldr1': '1111',
                r'\$plc_hldr2': 'abcd',
                r'\$\d*date_placeholder': '20200101'}

def remove_escape_chars(reggie):
    # Strip the "$"/digit prefix (as a regex key or as matched text) and any stray backslashes
    return re.sub(r'\\\$\\d\*|\$\d*|\\', '', reggie)

def multiple_replace(escape_dict, text):
    # Create a second dictionary to look up replacements by unescaped key
    unescaped_placeholders = {remove_escape_chars(k): v for k, v in escape_dict.items()}

    # Create a regular expression from all of the dictionary keys
    regex = re.compile("|".join(escape_dict.keys()))
    return regex.sub(lambda match: unescaped_placeholders[remove_escape_chars(match.group(0))], text)

text = "sometext $plc_hldr1 some more text $plc_hldr2 some more more text $1234date_placeholder some text $5678date_placeholder"

result = multiple_replace(placeholders, text)
print(result)

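The downside of this approach is that if you introduce a new kind of pattern into your placeholders, you have to update the regex in the remove_escape_chars(...) function. (It does extend, unchanged, to similar patterns such as $1234dname_placeholder or $1234age_placeholder, once the matching keys are added to placeholders.)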
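One possible way around that maintenance cost, sketched here as a different technique rather than a refinement of the code above, is to use a single generic pattern that skips any digits after the `$` and captures the name, then look the name up directly. The names `replacements`, `PLACEHOLDER` and `replace_placeholders`, the `dname_placeholder` value and the `$unknown` token are all hypothetical. The sketch assumes placeholder names never start with a digit themselves.

```python
import re

# Hypothetical lookup table keyed by placeholder name, without "$" or digits.
replacements = {
    "plc_hldr1": "1111",
    "plc_hldr2": "abcd",
    "date_placeholder": "20200101",
    "dname_placeholder": "John",  # illustrative value only
}

# "$", optional digits, then a name that starts with a letter or underscore.
PLACEHOLDER = re.compile(r"\$\d*([A-Za-z_]\w*)")

def replace_placeholders(values, text):
    # Unknown names are left untouched by returning the whole match.
    return PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), text)

text = ("sometext $plc_hldr1 some more text $plc_hldr2 some more more text "
        "$1234date_placeholder some text $5678date_placeholder "
        "$42dname_placeholder $unknown")

result = replace_placeholders(replacements, text)
print(result)
```

With this shape, supporting a new numbered placeholder only means adding a key to the dictionary. The trade-off is that every `$name` token in the text is treated as a candidate placeholder, which may or may not suit your input.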

1, accepted, 2021-03-02 18:26:49
