not matching a set number of characters in regex

Question

I have the following string (a file name)

diff_pr_EUR-44_cordex_rcp45_mon_ave_2048-2060_minus_2005-2017_mon10_ave1_withsd.nc

I would like to use regex to extract and generate the following string

rcp45_mon10

I have tried this so far with an online regex tester

rcp\d\d[^.]+mon\d+

Which extracts more than what I need...

rcp45_mon_ave_2048-2060_minus_2005-2017_mon10

How can I get regex to skip subsequent characters until it reaches the mon10 part?

Thanks

Answer 1

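You can match using two capturing groups and join them:

>>> import re
>>> s = 'diff_pr_EUR-44_cordex_rcp45_mon_ave_2048-2060_minus_2005-2017_mon10_ave1_withsd.nc'
>>> ''.join(re.findall(r'(rcp\d{2}).*?(_mon\d{2})', s)[0])
'rcp45_mon10'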
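
Indexing the findall result with [0] raises an IndexError when the string does not contain the pattern. A minimal sketch of the same two-group idea using re.search, which allows an explicit check for "no match", might look like this. The helper name extract_tag and the second file name are made up for illustration and are not part of the original answer.

```python
import re

# Same two-group idea as above: capture the rcp part, lazily skip ahead,
# then capture the first _mon followed by two digits.
PATTERN = re.compile(r'(rcp\d{2}).*?(_mon\d{2})')

def extract_tag(filename):
    """Hypothetical helper: return e.g. 'rcp45_mon10', or None if absent."""
    match = PATTERN.search(filename)
    if match is None:
        return None
    # match.groups() holds the two captured pieces; join them together.
    return ''.join(match.groups())

s = 'diff_pr_EUR-44_cordex_rcp45_mon_ave_2048-2060_minus_2005-2017_mon10_ave1_withsd.nc'
print(extract_tag(s))            # rcp45_mon10
print(extract_tag('no_tag.nc'))  # None
```

The text between the two groups is still matched by the regex; it is simply not captured, which is why a single contiguous match cannot produce rcp45_mon10 on its own.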

Answer 2

You may use re.sub here:

>>> import re
>>> s = 'diff_pr_EUR-44_cordex_rcp45_mon_ave_2048-2060_minus_2005-2017_mon10_ave1_withsd.nc'
>>> print(re.sub(r'^.*?(rcp\d+).*(_mon\d+).*', r'\1\2', s))
rcp45_mon10

Details:

^.*? : Match 0 or more of any character at the start (lazy)
(rcp\d+) : Match and capture rcp followed by 1+ digits in group #1
.* : Match 0 or more of any character (greedy)
(_mon\d+) : Match and capture _mon followed by 1+ digits in group #2
.* : Match anything till the end
r'\1\2' : Replace the string with the back-references to group #1 and group #2
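
The greedy .* in the middle matters when a name contains more than one _mon<digits> part: greedy keeps the last one, while a lazy .*? would keep the first. The sketch below illustrates this. The second file name and the variable names are invented for the example, and the lazy variant is shown only for comparison, not as part of the answer.

```python
import re

s = 'diff_pr_EUR-44_cordex_rcp45_mon_ave_2048-2060_minus_2005-2017_mon10_ave1_withsd.nc'
# Made-up file name with two _mon<digits> parts, for illustration only.
two_mons = 'pr_rcp85_mon03_ave_2048-2060_mon10_withsd.nc'

greedy = r'^.*?(rcp\d+).*(_mon\d+).*'   # pattern from the answer
lazy = r'^.*?(rcp\d+).*?(_mon\d+).*'    # variant with a lazy middle part

print(re.sub(greedy, r'\1\2', s))         # rcp45_mon10
print(re.sub(greedy, r'\1\2', two_mons))  # rcp85_mon10 (last _mon<digits>)
print(re.sub(lazy, r'\1\2', two_mons))    # rcp85_mon03 (first _mon<digits>)

# re.sub returns the input unchanged when the pattern does not match.
print(re.sub(greedy, r'\1\2', 'readme.txt'))  # readme.txt
```

Because re.sub returns the original string when nothing matches, comparing the result with the input is one way to detect names that do not follow the expected layout.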

2 answers

solution1
2 2018-08-30 15:14:01

solution2
2 ACCEPTED 2018-08-30 15:15:55
