Python regex - strip out beginning and end and leave middle untouched

I need to strip out (i.e., replace with nothing) everything on either side of the alphanumeric code in the middle of each filename in a series. I can do it in two steps, but I would like to do it in one.

Two steps:

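import re

filename = "NRC_401653_XL3213456321_NRCE_KR.pdf"

# Remove the leading "NRC_401653_" prefix
front_gone = re.sub(r'\w{3}_\d{6}_', '', filename)

# Remove the trailing "_NRCE_KR.pdf" suffix (dot escaped so it matches literally)
both_gone = re.sub(r'_NRCE_KR\.pdf', '', front_gone)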

This will result in just XL3213456321 remaining, which is what I need. I would like to do this in one step.

Try:

import re
filename = "NRC_401653_XL3213456321_NRCE_KR.pdf"
print(re.sub(r"\w{3}_\d+_(\w+)_NRCE_KR\.pdf", r"\1", filename))

Output:

XL3213456321

(\w+) captures the middle code as group 1. Because the pattern matches the whole filename, passing \1 as the replacement swaps the entire string for just that captured code.
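One caveat with re.sub is that it returns the input unchanged when the pattern does not match, so an unexpected filename passes through silently. A minimal sketch of an alternative, assuming the same pattern as above, reads the capture group directly and returns None on a mismatch. The helper name extract_code is hypothetical and used only for illustration.

```python
import re

# Same pattern as in the answer above; compiled once for reuse.
PATTERN = re.compile(r"\w{3}_\d+_(\w+)_NRCE_KR\.pdf")

def extract_code(filename):
    """Return the middle code, or None if the filename does not fit the pattern.

    extract_code is a hypothetical helper name, not part of the original answer.
    """
    match = PATTERN.fullmatch(filename)
    return match.group(1) if match else None

print(extract_code("NRC_401653_XL3213456321_NRCE_KR.pdf"))  # XL3213456321
print(extract_code("unrelated.pdf"))                         # None
```

Checking for None makes it easy to skip or log filenames that do not follow the expected layout when looping over a whole series.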
