How to replace only part of the match with python re.sub

Question

I need to match two cases with one regular expression and do a replacement:

'long.file.name.jpg' -> 'long.file.name_suff.jpg'

'long.file.name_a.jpg' -> 'long.file.name_suff.jpg'

I'm trying to do the following:

re.sub(r'(\_a)?\.[^\.]*$', '_suff.', "long.file.name.jpg")

But this cuts off the extension '.jpg', and I get

long.file.name_suff. instead of long.file.name_suff.jpg. I understand that this is because of the [^.]*$ part, but I can't exclude it, because I have to find the last occurrence of '_a' to replace, or the last '.'.

Is there a way to replace only part of the match?

Answer 1

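Put a capture group around the part that you want to preserve, and then include a reference to that capture group in your replacement text.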

re.sub(r'(\_a)?\.([^\.]*)$' , r'_suff.\2',"long.file.name.jpg")
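As a quick check, the same idea can be wrapped in a small helper and run against both inputs from the question. This is only a sketch: the function name add_suffix is hypothetical, and the pattern is the one above with the redundant backslashes dropped.

```python
import re

def add_suffix(filename):
    # Group 1 optionally consumes a trailing "_a"; group 2 captures the
    # extension so the replacement can put it back with \2.
    return re.sub(r'(_a)?\.([^.]*)$', r'_suff.\2', filename)

print(add_suffix("long.file.name.jpg"))    # long.file.name_suff.jpg
print(add_suffix("long.file.name_a.jpg"))  # long.file.name_suff.jpg
```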

Answer 2

re.sub(r'(?:_a)?\.([^.]*)$', r'_suff.\1', "long.file.name.jpg")

(?: starts a non-capturing group (SO answer), so (?:_a) matches the _a without numbering it as a group; the following question mark makes it optional.

So in English, this says: match the ending .<anything>, whether or not it follows the pattern _a.
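To make the numbering difference concrete, here is a small sketch comparing what each variant captures for the same input. The variable names are illustrative only.

```python
import re

name = "long.file.name_a.jpg"

# With a capturing group, "_a" takes group number 1 and the extension is group 2.
capturing = re.search(r'(_a)?\.([^.]*)$', name)
# With (?:...), "_a" is still matched but gets no number, so the extension is group 1.
non_capturing = re.search(r'(?:_a)?\.([^.]*)$', name)

print(capturing.groups())      # ('_a', 'jpg')
print(non_capturing.groups())  # ('jpg',)
```

This is why the replacement in this answer refers to \1 where the capturing version would need \2.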

Another way to do this would be to use a lookbehind (see here). I mention this because lookbehinds are super useful, but I didn't know about them for 15 years of doing REs.

Answer 3

Just put the expression for the extension into a group, capture it and reference the match in the replacement:

re.sub(r'(?:_a)?(\.[^\.]*)$' , r'_suff\1',"long.file.name.jpg")

Additionally, using the non-capturing group (?:…) prevents re from storing unneeded information.

Answer 4

You can do it by excluding parts of the match from the replacement. In other words, you can tell the regex module: "match with this pattern, but replace only a piece of it".

re.sub(r'(?<=long.file.name)(\_a)?(?=\.([^\.]*)$)' , r'_suff',"long.file.name.jpg")
>>> 'long.file.name_suff.jpg'

The long.file.name and .jpg parts are used for matching, but they are excluded from the replacement.
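The same lookaround idea can be sketched without hard-coding the file name, by keeping only the lookahead for the extension. The helper name add_suffix_lookahead is hypothetical, and count=1 is an addition that the pattern above does not need: here the optional group can match an empty string, and on Python 3.7 and later re.sub also replaces an empty match that directly follows a non-empty one.

```python
import re

def add_suffix_lookahead(filename):
    # (?:_a)? optionally consumes "_a"; the lookahead (?=\.[^.]*$) only checks
    # that the final ".ext" follows, without making it part of the match.
    # count=1 stops after the first match, so the empty match that sits right
    # after a consumed "_a" is not replaced a second time.
    return re.sub(r'(?:_a)?(?=\.[^.]*$)', '_suff', filename, count=1)

print(add_suffix_lookahead("long.file.name.jpg"))    # long.file.name_suff.jpg
print(add_suffix_lookahead("long.file.name_a.jpg"))  # long.file.name_suff.jpg
```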

Answer 5

I wanted to use capture groups to replace a specific part of a string to help me parse it later. Consider the example below:

s= '<td> <address> 110 SOLANA ROAD, SUITE 102<br>PONTE VEDRA BEACH, FL32082 </address> </td>'

re.sub(r'(<address>\s.*?)(<br>)(.*?\<\/address>)', r'\1 -- \3', s)
##'<td> <address> 110 SOLANA ROAD, SUITE 102 -- PONTE VEDRA BEACH, FL32082 </address> </td>'
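For longer patterns like this one, named groups can make the replacement easier to read. The following is a sketch of the same substitution; the group names head and tail are made up for illustration, and the <br> is left uncaptured because it is being dropped anyway.

```python
import re

s = '<td> <address> 110 SOLANA ROAD, SUITE 102<br>PONTE VEDRA BEACH, FL32082 </address> </td>'

# (?P<name>...) defines a named group; \g<name> refers to it in the replacement.
result = re.sub(
    r'(?P<head><address>\s.*?)<br>(?P<tail>.*?</address>)',
    r'\g<head> -- \g<tail>',
    s,
)
print(result)
# <td> <address> 110 SOLANA ROAD, SUITE 102 -- PONTE VEDRA BEACH, FL32082 </address> </td>
```

The \g<...> form also works with numbers (\g<1>), which helps when a group reference is immediately followed by a digit.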

Answer 6

print(re.sub('name(_a)?','name_suff','long.file.name_a.jpg'))
# long.file.name_suff.jpg

print(re.sub('name(_a)?','name_suff','long.file.name.jpg'))
# long.file.name_suff.jpg

6 answers

solution1
122 2010-05-04 08:15:15

solution2
45 ACCEPTED 2010-05-04 08:17:02

solution3
10 2010-05-04 08:16:41

solution4
8 2015-06-11 14:16:54

solution5
0 2022-05-16 12:53:21

solution6
-1 2021-10-03 06:31:46
