Python regex - strip out beginning and end and leave middle untouched

I need to strip out (i.e., substitute with nothing) everything on either side of a numeral in the middle of a series of filenames. I can do it in two steps, but I would like to do it in one.

Two steps:

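import re

filename = "NRC_401653_XL3213456321_NRCE_KR.pdf"

# Remove the leading "NRC_401653_" prefix
front_gone = re.sub(r'(\w{3})_(\d{6})_', '', filename)

# Remove the trailing "_NRCE_KR.pdf" suffix (the dot is escaped so it matches literally)
both_gone = re.sub(r'_NRCE_KR\.pdf', '', front_gone)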

This leaves just XL3213456321, which is what I need. I would like to do this in one step.

Try:

import re
filename = "NRC_401653_XL3213456321_NRCE_KR.pdf"
print(re.sub(r"\w{3}_\d+_(\w+)_NRCE_KR\.pdf", r"\1", filename))

Output:

XL3213456321

(\w+) is a capturing group (group 1) that matches the middle code. Because the pattern matches the entire filename, passing \1 as the replacement substitutes the whole string with just that captured middle part.
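One thing to watch with the substitution approach: if a filename does not fit the pattern, re.sub returns it unchanged, so a bad name can slip through silently. As a minimal sketch, assuming every filename follows the same shape as the single sample above, the group can also be read directly from a match object. The names MIDDLE_CODE and extract_middle are hypothetical, introduced only for illustration.

```python
import re

# Hypothetical name; the prefix and suffix shapes are assumed from the one
# sample filename in the question.
MIDDLE_CODE = re.compile(r"\w{3}_\d+_(\w+)_NRCE_KR\.pdf")

def extract_middle(filename):
    """Return the middle code, or None if the filename does not fit the pattern."""
    match = MIDDLE_CODE.fullmatch(filename)
    return match.group(1) if match else None

print(extract_middle("NRC_401653_XL3213456321_NRCE_KR.pdf"))  # XL3213456321
print(extract_middle("notes.txt"))  # None
```

Returning None makes non-matching filenames easy to detect, whereas the re.sub version would hand back "notes.txt" as-is.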
