How to replace multiple words of a string using regex in python?

I have a dictionary like:

dic = { "xl": "xlarg", "l": "larg",'m':'medium'}

and I'd like to use re.sub or a similar method to find any string (including a single letter) that is in dic.keys() and replace it with that key's value.

import re

def multiple_replace(d, text):
    # Create a regular expression from the dictionary keys
    regex = re.compile("(%s)" % "|".join(map(re.escape, d.keys())))

    # For each match, look up the corresponding value in the dictionary
    return regex.sub(lambda mo: d[mo.group(0)], text)

It works well for standalone letters, e.g. it changes "size m" to "size medium", but it also replaces letters inside words, e.g. it changes "monday" to "mediumonday".

Thanks

You can use re.compile and the pattern's sub method to find the matching substrings and replace them. The idea is to join all of the keys into a single pattern using the alternation operator | . For each match, you then use the matched substring to look up its replacement in the dict.

To keep the keys from matching inside longer words, add a negative lookbehind and a negative lookahead. The lookbehind (?<!\w) requires that the character before the match is not a word character. The lookahead (?!\w) requires that the character after the match is not a word character either.

Putting this all together, we have: r"(?<!\w)(xl|l|m)(?!\w)"
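
As a quick check of what the lookarounds change, here is a minimal sketch that runs the same alternation with and without them. The names guarded and unguarded and the sample text are only illustrative.

```python
import re

# Alternation of the keys, guarded by the negative lookbehind and lookahead.
guarded = re.compile(r"(?<!\w)(xl|l|m)(?!\w)")
# The same alternation without the guards, for comparison.
unguarded = re.compile(r"(xl|l|m)")

text = "monday size m"
print(unguarded.findall(text))  # ['m', 'm']  (also hits the "m" in "monday")
print(guarded.findall(text))    # ['m']       (only the standalone "m")
```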

Here's an example:

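import re

def replace_substrings(s, d):
    # Join the escaped keys into one alternation, e.g. "xl|l|m"
    p = "|".join(map(re.escape, d.keys()))
    # Only match a key that is not adjacent to another word character
    p = r"(?<!\w)(" + p + r")(?!\w)"
    return re.compile(p).sub(lambda m: d[m.group(0)], s)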


dic = {"xl": "xlarg", "l": "larg",'m':'medium'}
inputs = [
    "size m",
    "monday",
    "xl sell",
    "m size m l xl",
]

for text in inputs:
    print(replace_substrings(text, dic))

This will output:

size medium
monday
xlarg sell
medium size medium larg xlarg
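
If the keys may contain regex metacharacters, a slightly more defensive variant could look like the sketch below. It passes each key through re.escape and tries longer keys first, which can matter when one key is a prefix of another (for instance keys containing spaces or punctuation). The function name replace_whole_words and the extra "x.l" key are hypothetical, added only for illustration.

```python
import re

def replace_whole_words(text, mapping):
    # Longest keys first, so a longer key is tried before a shorter key it starts with.
    keys = sorted(mapping, key=len, reverse=True)
    # re.escape stops characters such as "." from being read as regex syntax.
    alternation = "|".join(map(re.escape, keys))
    pattern = re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)")
    return pattern.sub(lambda m: mapping[m.group(0)], text)

# "x.l" is a made-up key; unescaped, its "." would also match the "y" in "xyl".
sizes = {"xl": "xlarg", "l": "larg", "m": "medium", "x.l": "xlarg"}
print(replace_whole_words("m size x.l, not monday or xyl", sizes))
# medium size xlarg, not monday or xyl
```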
