Not matching a set number of characters in regex
I have the following expression:
diff_pr_EUR-44_cordex_rcp45_mon_ave_2048-2060_minus_2005-2017_mon10_ave1_withsd.nc
I would like to use regex to extract and generate the following string:
rcp45_mon10
So far I have tried this with an online regex tester:
rcp\d\d[^.]+mon\d+
This extracts more than I need:
rcp45_mon_ave_2048-2060_minus_2005-2017_mon10
How can I get the regex to skip the subsequent characters until it reaches the mon10 part?
Thanks
You can match using two capturing groups and join them:
>>> ''.join(re.findall(r'(rcp\d{2}).*?(\_mon\d{2})', s)[0])
'rcp45_mon10'
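A slightly more defensive variant of the same idea, offered as a sketch rather than the answerer's own code: re.search returns None when nothing matches, which avoids the IndexError that indexing an empty findall result would raise. The function name extract_tag is hypothetical, and \d+ is used instead of \d{2} on the assumption that the digit count may vary.

```python
import re

def extract_tag(name):
    """Return e.g. 'rcp45_mon10' from a file name, or None if there is no match."""
    # Group 1: 'rcp' plus digits. The lazy '.*?' then advances to the first
    # '_mon' that is directly followed by a digit, so '_mon_ave' is skipped.
    m = re.search(r'(rcp\d+).*?(_mon\d+)', name)
    return ''.join(m.groups()) if m else None

s = 'diff_pr_EUR-44_cordex_rcp45_mon_ave_2048-2060_minus_2005-2017_mon10_ave1_withsd.nc'
print(extract_tag(s))       # rcp45_mon10
print(extract_tag('x.nc'))  # None
```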
You may use re.sub here:
>>> s = 'diff_pr_EUR-44_cordex_rcp45_mon_ave_2048-2060_minus_2005-2017_mon10_ave1_withsd.nc'
>>> print (re.sub(r'^.*?(rcp\d+).*(_mon\d+).*', r'\1\2', s))
rcp45_mon10
Details:
^.*? : Match 0 or more of any character at the start (lazy)
(rcp\d+) : Match and capture rcp followed by 1+ digits in group #1
.* : Match 0 or more of any character (greedy)
(_mon\d+) : Match and capture _mon followed by 1+ digits in group #2
.* : Match anything till the end
r'\1\2' : Replace the string with back-references to group #1 and group #2
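Two consequences of the breakdown above may be worth seeing in code. The sketch below is an illustration, not part of the original answer: the file name with two _mon<digits> parts is made up, and the names GREEDY and LAZY are hypothetical. It shows that the greedy middle .* selects the last _mon<digits> while a lazy .*? selects the first, and that re.sub returns the input unchanged when the pattern does not match.

```python
import re

# Middle '.*' is greedy: group 2 ends up on the LAST '_mon<digits>'.
GREEDY = r'^.*?(rcp\d+).*(_mon\d+).*'
# Middle '.*?' is lazy: group 2 ends up on the FIRST '_mon<digits>'.
LAZY = r'^.*?(rcp\d+).*?(_mon\d+).*'

# Hypothetical name containing two '_mon<digits>' parts, for illustration only.
name = 'x_rcp45_mon01_ave_mon10_y.nc'

print(re.sub(GREEDY, r'\1\2', name))           # rcp45_mon10
print(re.sub(LAZY, r'\1\2', name))             # rcp45_mon01
# No 'rcp<digits>' present, so nothing is replaced and the input comes back as-is.
print(re.sub(GREEDY, r'\1\2', 'no_match.nc'))  # no_match.nc
```

For the file name in the question there is only one _mon<digits> part, so both variants give rcp45_mon10. Because a failed match leaves the string untouched, it can be worth checking with re.search first when a silent pass-through would be a problem.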