
Python: Replacing multiple specific words from a list with re.sub

I have the following string and a list, 'changewords'. I would like to replace '{word from list} \n' with '{word from list}:'. I don't want to replace all instances of '\n'.

string = "Foo \n value of something \n Bar \n Another value \n"
changewords = ["Foo", "Bar"]

Desired Output:

'Foo: value of something \n Bar: Another value \n'

I have tried the following

for i in changewords:
    tem = re.sub(f'{i} \n', f'{i}:', string)
tem
Output: 'Foo \n value of something \n Bar: Another value \n'

and

changewords2 = '|'.join(changewords)
tem = re.sub(f'{changewords2} \n', f'{changewords2}:', string)
tem
Output: 'Foo|Bar: \n value of something \n Foo|Bar: Another value \n'

How can I get my desired output?

You may use this code:

import re

string = "Foo \n value of something \n Bar \n Another value \n"
changewords = ["foo", "Bar"]

tem = string
for i in changewords:
    tem = re.sub(f'(?i){i} \n', f'{i}:', tem)
print( tem )

Output:

foo: value of something
 Bar: Another value

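Note that tem is initialized with tem = string, and that inside the for loop re.sub is applied to tem and the result is assigned back to tem. Each replacement then builds on the previous one instead of starting again from the original string.

(?i) enables case-insensitive matching. The replacement uses the word as written in the list, which is why the output starts with 'foo:' rather than 'Foo:'.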
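The loop above interpolates each word straight into the pattern, which works for plain words like Foo and Bar. If a word could contain regex metacharacters, a hedged variant is to escape it first. The sketch below assumes a hypothetical word "C++" that is not in the original question, and uses re.escape together with a replacement function so the word is treated literally on both sides:

```python
import re

# "C++" is a made-up example word; it is not part of the original question.
string = "Foo \n value of something \n C++ \n Another value \n"
changewords = ["Foo", "C++"]

tem = string
for word in changewords:
    # re.escape makes the word match literally ("+" would otherwise be a quantifier).
    pattern = re.escape(word) + r" \n"
    # A function replacement is returned verbatim, so the word is never
    # parsed as a replacement template.
    tem = re.sub(pattern, lambda mo, w=word: f"{w}:", tem, flags=re.I)

print(repr(tem))
```

Without re.escape, the pattern built from "C++" is not a valid regular expression, so re.sub would raise re.error.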


Using replacement string:

A slightly more elegant way of doing it is this one-liner:

re.sub(rf"({'|'.join(changewords)}) \n", r"\1:", string, flags=re.I)

demo:

>>> string = "Foo \n value of something \n Bar \n Another value \n"
>>> changewords = ['Foo', 'Bar', 'Baz', 'qux']
>>> 
>>> re.sub(rf"({'|'.join(changewords)}) \n", r"\1:", string, flags=re.I)
'Foo: value of something \n Bar: Another value \n'
>>> 

You can specify case-insensitive matching with the flags argument. The replacement string can have anything needed around \1, such as colons or commas.

Worth noting: a Python string literal can carry more than one prefix. For instance, you can combine r and f, as in rf"my raw formatted string"; the order of the prefixes doesn't matter.
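A small check of the prefix behaviour, as a sketch: the variable names are only illustrative. It also shows why the non-raw f-strings in the question still match, since a literal newline character in a pattern matches a newline just as the \n escape does:

```python
import re

word = "Foo"

raw_a = rf"{word} \n"   # backslash and 'n' stay as two separate characters
raw_b = fr"{word} \n"   # same literal with the prefixes in the other order
plain = f"{word} \n"    # no r prefix: Python turns \n into a real newline

text = "Foo \n value"
# All three work as patterns: the regex engine reads the two-character
# escape \n as a newline, and a literal newline matches itself.
matches = [bool(re.search(p, text)) for p in (raw_a, raw_b, plain)]
print(raw_a == raw_b, len(raw_a), len(plain), matches)
```

The raw prefix matters more once the pattern contains escapes such as \b or \1, which Python would otherwise interpret before the regex engine sees them.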

Within the pattern argument of re.sub(pattern, repl, string), you can define groups by placing parentheses () around part of the pattern.

Groups can then be referenced in the replacement string, repl, with a backslash followed by the group's number; the first group is referred to by \1.

In the call re.sub(r"(A|B|C) \n", r"\1:", string), \1 in the replacement string refers to the first group, (A|B|C), in the pattern.
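To make the group reference concrete, here is a minimal sketch using the question's string with the words written out by hand. It also shows \g<1>, the equivalent explicit form that the re module accepts in replacement strings; the variable names are only illustrative:

```python
import re

string = "Foo \n value of something \n Bar \n Another value \n"

# Group 1 is whichever alternative matched: Foo or Bar.
pattern = r"(Foo|Bar) \n"

numbered = re.sub(pattern, r"\1:", string)
# \g<1> is the same reference; it stays unambiguous if a digit follows it.
explicit = re.sub(pattern, r"\g<1>:", string)
# Text can be placed on either side of the reference.
wrapped = re.sub(pattern, r"[\1]:", string)

print(repr(numbered))
print(repr(wrapped))
```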

Using replacement function:

Suppose you want to replace words in the target string with other words from a dictionary. For instance you want 'Bar' to be replaced with 'Hank' and 'Foo' with 'Bernard'. This can be done using a replacement function instead of replacement string:

>>> repl_dict = {'Foo':'Bernard', 'Bar':'Hank'}
>>> 
>>> expr = rf"({'|'.join(repl_dict.keys())}) \n"   # Becomes '(Foo|Bar) \n'
>>>
>>> func = lambda mo: f"{repl_dict[mo.group(1)]}:"
>>> 
>>> re.sub(expr, func, string, flags=re.I)
'Bernard: value of something \n Hank: Another value \n'
>>> 

This could be another one-liner, but I broke it up for clarity...

The lambda takes the match object, mo, that re.sub passes to it and extracts the text of the first group. The first group in the regular expression is the part enclosed by (), such as (A|B|C).

The replacement function references this first group with mo.group(1), just as the replacement string referenced it with \1 in the previous example.

Then the replacement function looks that text up in the dict and returns the final replacement string for the match.
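One caveat with the dictionary lookup: it uses the matched text exactly as it appears, so with flags=re.I a match such as 'foo' would raise KeyError because the key is 'Foo'. A possible workaround, sketched below, is to normalize the case on both sides. The lowered dict and the replace_word function are hypothetical names introduced for this example, and the input string is altered to mix cases:

```python
import re

# Input with different casing than the dictionary keys (altered for illustration).
string = "foo \n value of something \n BAR \n Another value \n"
repl_dict = {"Foo": "Bernard", "Bar": "Hank"}

# Lower-cased copy of the dictionary, used only for lookups.
lowered = {key.lower(): value for key, value in repl_dict.items()}

# re.escape guards against keys that contain regex metacharacters.
expr = rf"({'|'.join(map(re.escape, repl_dict))}) \n"

def replace_word(mo):
    # Lower-case the matched text so the lookup ignores case too.
    return f"{lowered[mo.group(1).lower()]}:"

result = re.sub(expr, replace_word, string, flags=re.I)
print(repr(result))
```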

You could very well not use regex at all. My first approach would be the built-in string method .replace(), something like:

string = "Foo \n value of something \n Bar \n Another value \n"
changewords = ["Foo", "Bar"]

for word in changewords:
   to_replace = "{0} \n".format(word)
   replacement = "{0}:".format(word)
   string = string.replace(to_replace, replacement)
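As a quick sanity check, the same loop can be wrapped in a function and compared with the regex one-liner. This is only a sketch; add_colons is a hypothetical name. Keep in mind that str.replace is case-sensitive, so unlike the re.I versions it leaves 'foo \n' untouched when the list contains 'Foo':

```python
import re

def add_colons(text, words):
    """Replace '<word> ' plus a newline with '<word>:' using str.replace only."""
    for word in words:
        text = text.replace(f"{word} \n", f"{word}:")
    return text

string = "Foo \n value of something \n Bar \n Another value \n"
changewords = ["Foo", "Bar"]

plain = add_colons(string, changewords)
regex = re.sub(rf"({'|'.join(changewords)}) \n", r"\1:", string)

# str.replace does exact, case-sensitive matching.
unchanged = add_colons("foo \n value", ["Foo"])

print(repr(plain))
```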

Hope it helps!
