
Python: how to replace only the content of a regex capture group?

-abc1234567-abc.jpg

I want to remove the -abc before .jpg and get -abc1234567.jpg. I tried re.sub(r'\d(-abc).jpg$', '', string), but it also replaces content outside the capture group and gives me -abc123456. Is it possible to replace only the content of the capture group, i.e. '-abc'?

One solution is to use a positive lookahead, as follows.

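import re
# The lookahead (?=\.jpg) requires ".jpg" to follow but does not consume it,
# so only "-abc" is part of the match and gets replaced.
# (The Python 2-only ur'' prefix is dropped; r'' works in Python 3.)
p = re.compile(r'(-abc)(?=\.jpg)')
test_str = "-abc1234567-abc.jpg"
subst = ""

result = re.sub(p, subst, test_str)  # "-abc1234567.jpg"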
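
As a quick check of the lookahead approach, here is a minimal Python 3 sketch. The `$` anchor inside the lookahead is an addition that is not in the answer above; it is assumed here so that only a trailing `.jpg` counts.

```python
import re

# Match "-abc" only when ".jpg" follows at the very end of the string.
# The lookahead asserts ".jpg" is there without consuming it.
# NOTE: the "$" is an assumption added for this sketch.
pattern = re.compile(r"-abc(?=\.jpg$)")

result = pattern.sub("", "-abc1234567-abc.jpg")
print(result)  # -abc1234567.jpg
```

The leading `-abc` is left alone because it is followed by digits rather than `.jpg`.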

Alternatively, you can use two capture groups and put back only the second one, as follows.

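import re
# Group 1 is the part to drop, group 2 is the part to keep.
# (The Python 2-only ur'' prefix is dropped; r'' works in Python 3.)
p = re.compile(r'(-abc)(\.jpg)')
test_str = "-abc1234567-abc.jpg"
subst = r"\2"  # re-emit only group 2, i.e. ".jpg"

result = re.sub(p, subst, test_str)  # "-abc1234567.jpg"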
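
The same idea can also be written with a function as the replacement, which may be easier to read when there are several groups. This is only a sketch in Python 3; the variable names are illustrative.

```python
import re

pattern = re.compile(r"(-abc)(\.jpg)")
text = "-abc1234567-abc.jpg"

# Replacement template: keep group 2 and drop group 1.
via_template = pattern.sub(r"\2", text)

# Equivalent: a callable receives each match object and returns the text to keep.
via_callable = pattern.sub(lambda m: m.group(2), text)

print(via_template)  # -abc1234567.jpg
print(via_callable)  # -abc1234567.jpg
```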

If you only want to remove -abc from the end of .jpg file names, you could use:

re.sub(r"-abc\.jpg$", ".jpg", string)
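
To illustrate which names this pattern touches, here is a small Python 3 sketch. The file names other than the one from the question are made up for the example.

```python
import re

# Only the first name comes from the question; the others are hypothetical.
filenames = ["-abc1234567-abc.jpg", "photo-abc.png", "plain.jpg"]

# Replace a trailing "-abc.jpg" with ".jpg"; names that do not end that way are unchanged.
cleaned_names = [re.sub(r"-abc\.jpg$", ".jpg", name) for name in filenames]

print(cleaned_names)  # ['-abc1234567.jpg', 'photo-abc.png', 'plain.jpg']
```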

To stay as close as possible to your code: put '()' around the parts you want to keep, not the part you want to remove. Then use \g<NUMBER> in the replacement to refer to those parts. Make the replacement a raw string too, so that \g is not treated as a string escape. So:

re.sub(r'(.*)-abc(\.jpg)$', r'\g<1>\g<2>', string)
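
If you really do want to keep the original pattern and replace only what the capture group matched, one possible approach is a replacement function that uses the group's offsets. The helper `sub_group` below is hypothetical (it is not part of the `re` module) and is only a sketch; it relies on `Match.start()` and `Match.end()`, which are standard.

```python
import re

def sub_group(pattern, replacement, text, group=1):
    """Replace only the text matched by `group`, keeping the rest of each match.

    Hypothetical helper for illustration; not part of the standard library.
    """
    def _replace(match):
        whole = match.group(0)
        if match.start(group) == -1:
            # The group did not take part in this match: leave the match as-is.
            return whole
        # Offsets of the group relative to the start of the whole match.
        start = match.start(group) - match.start()
        end = match.end(group) - match.start()
        return whole[:start] + replacement + whole[end:]
    return re.sub(pattern, _replace, text)

# The pattern from the question, with the dot before "jpg" escaped.
cleaned = sub_group(r"\d(-abc)\.jpg$", "", "-abc1234567-abc.jpg")
print(cleaned)  # -abc1234567.jpg
```

Here the digit and `.jpg` are part of the match but are copied back unchanged, so only `-abc` is removed.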


 