
How can I match a string and modify only part of it in the original sentence?

Suppose I have a string:

Here is a medicine (take it twice a day)

My paragraph contains many parenthesized phrases of the form (...), but I only need to change one particular phrase at a time. So I need to match the whole (...) and then modify only the text inside the parentheses.

In this sentence I need to match (take it twice a day) and then change take it twice a day to whatever I want.

Most importantly, the change must be made in the original text. I tried re.sub, but it replaces the whole (...). How can I edit just the text inside () in the original sentence?

re.sub() can use a function as the replacement instead of a string. The function will get the Match object as its argument, and then must return the replacement string.

For example, you can change the part in parentheses to upper case:

>>> re.sub(r"\(.*\)", lambda m: m[0].upper(), "Here is a medicine (take it twice a day)")
'Here is a medicine (TAKE IT TWICE A DAY)'

You can replace matched groups with any string you can compute from what was matched.

When there are multiple (...) groups, you might want to use .*? in the pattern to match as little as possible, instead of .*, which is greedy. For example:

>>> re.sub(r"\(.*\)", lambda m: m[0].upper(), "Here (is a) medicine (take it twice a day)")
'Here (IS A) MEDICINE (TAKE IT TWICE A DAY)'
>>> re.sub(r"\(.*?\)", lambda m: m[0].upper(), "Here (is a) medicine (take it twice a day)")
'Here (IS A) medicine (TAKE IT TWICE A DAY)'
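If only one particular (...) should change, the replacement function can inspect what was matched and return the match unchanged otherwise. Here is a minimal sketch of that idea; the helper name replace_note and its parameters are made up for illustration and are not part of the re module:

```python
import re

def replace_note(text, old_note, new_note):
    """Rewrite only the parenthesized note equal to old_note (hypothetical helper)."""
    def repl(m):
        # m[1] is the text between the parentheses, m[0] is the whole "(...)".
        if m[1] == old_note:
            return "(" + new_note + ")"
        return m[0]  # leave every other (...) exactly as it was
    # Non-greedy .*? so each (...) is matched separately.
    return re.sub(r"\((.*?)\)", repl, text)

s = "Here (is a) medicine (take it twice a day)"
result = replace_note(s, "take it twice a day", "take it with water")
print(result)  # Here (is a) medicine (take it with water)
```

Returning m[0] for the non-matching cases is what keeps the rest of the sentence intact.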

Thanks. But how can I change take it twice a day to another sentence using a lambda? – Sakurai-ST

Have the lambda return the string you want to substitute. (Or if it's a constant string, you can use that directly instead of the lambda.)

Examples of what you want might help.


For example, I want to change Here is a medicine (take it twice a day) to Here is a medicine (take it with water) .

You don't need a lambda for that, since the parentheses are the same in every match. You can simply include them in the replacement string:

>>> re.sub(r"\(.*\)", "(take it with water)", "Here is a medicine (take it twice a day)")
'Here is a medicine (take it with water)'

You could also use lookahead/lookbehind assertions. They are not included in the match, but the text immediately before or after the match must satisfy them. It's overkill for parentheses, but may be useful in other cases.

>>> re.sub(r"(?<=\().*(?=\))", "take it with water", "Here is a medicine (take it twice a day)")
'Here is a medicine (take it with water)'
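One caveat with the assertion pattern: .* is greedy, so with several (...) in the same string it runs from the first ( to the last ). The sketch below swaps in [^()]* (an assumption that the parentheses are never nested) and uses the count argument of re.sub to change only the first occurrence:

```python
import re

s = "Here (is a) medicine (take it twice a day)"

# Greedy .* spans from the first "(" to the last ")".
greedy = re.sub(r"(?<=\().*(?=\))", "take it with water", s)

# [^()]* cannot cross a parenthesis, so each (...) is handled on its own.
inner = r"(?<=\()[^()]*(?=\))"
all_changed = re.sub(inner, "take it with water", s)
first_only = re.sub(inner, "take it with water", s, count=1)

print(greedy)       # Here (take it with water)
print(all_changed)  # Here (take it with water) medicine (take it with water)
print(first_only)   # Here (take it with water) medicine (take it twice a day)
```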

Sorry to bother you, but what should I do if I want to use lambda? I think I might use it in the future, not in this case.

Inside the lambda, you can pull captured groups out of the match object. For example, this single pattern works with two types of brackets and keeps whichever kind was matched:

>>> re.sub(r"([([]).*?([\])])", lambda m: m[1]+"take it with water"+m[2], "Here is a medicine (take it twice a day)")
'Here is a medicine (take it with water)'
>>> re.sub(r"([([]).*?([\])])", lambda m: m[1]+"take it with water"+m[2], "Here is a medicine [take it twice a day]")
'Here is a medicine [take it with water]'

But again, you could do this much with assertions. You could also do this kind of thing with backreferences in a string argument, no lambda necessary:

>>> re.sub(r"([([]).*?([\])])", r"\1take it with water\2", "Here is a medicine (take it twice a day)")
'Here is a medicine (take it with water)'
>>> re.sub(r"([([]).*?([\])])", r"\1take it with water\2", "Here is a medicine [take it twice a day]")
'Here is a medicine [take it with water]'
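One thing to watch with backreferences in the replacement string: if the new text starts with a digit, \1 followed by that digit is read as a two-digit group number. The \g<1> form avoids the ambiguity. A small sketch, where the replacement text "2 tablets daily" is just an example:

```python
import re

pattern = r"([(\[]).*?([)\]])"
s = "Here is a medicine (take it twice a day)"

# \g<1> is the unambiguous spelling of \1.
result = re.sub(pattern, r"\g<1>2 tablets daily\g<2>", s)
print(result)  # Here is a medicine (2 tablets daily)

# r"\12 tablets daily\2" would be parsed as a reference to group 12,
# which this pattern does not have, so re.sub raises re.error.
try:
    re.sub(pattern, r"\12 tablets daily\2", s)
    raised = False
except re.error:
    raised = True
print(raised)  # True
```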

If you want to find all occurrences:

>>> import re
>>> s = "Here is a medicine (take it twice a day)"
>>> re.findall(r'\(.*?\)', s)
['(take it twice a day)']

>>> # With a capture group, findall returns only the captured text.
>>> re.findall(r'\((.*?)\)', s)
['take it twice a day']
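If you also need to know where each occurrence sits in the original text, re.finditer yields match objects with positions, which findall does not. As a rough sketch, you can pick one match and splice the new text into the original string by slicing; choosing the second match here is arbitrary:

```python
import re

s = "Here (is a) medicine (take it twice a day)"

# Collect every (...) together with its position in s.
matches = list(re.finditer(r"\((.*?)\)", s))
for m in matches:
    print(m.span(1), m[1])  # span(1) is the (start, end) of the inner text

# Replace only the inner text of the second match, by position.
start, end = matches[1].span(1)
edited = s[:start] + "take it with water" + s[end:]
print(edited)  # Here (is a) medicine (take it with water)
```

Strings are immutable in Python, so "editing the original" always means building a new string like this and reassigning it.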
