
Exclude the middle of a capture group regex

I have a string:

2km739

and I am trying to use a regex to capture 2739, that is, the digits without the km in the middle.

I know I could use two capture groups and combine them afterwards (EDIT: or extract the numeric characters after capturing the group), but a single capture group would be a little easier in this situation, and I am curious whether it is possible.

I have this:

([0-9](?=[km])(?<=[km])\d+)

but it doesn't match anything.

It only matches if I add the literal km to the pattern:

([0-9](?=[km])km(?<=[km])\d+)

I would also expect this to work, but I learned that the text matched by a non-capturing group is still captured by the enclosing group:

([0-9](?:km)\d+)

If you want to remove all of the letters and capture only digits, you can change the capture group to do that.

(\\d+)

You'll need to merge all of the captured groups at the end, as you can't skip over pieces of the input without closing the capture group.
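
As a minimal sketch of that merge step, assuming Python and a hypothetical helper named digits_only (neither comes from the answer above), every run of digits can be collected and joined:

```python
import re

def digits_only(text):
    """Collect every run of digits in text and join them into one string."""
    # re.findall returns each non-overlapping match of \d+ as a list item,
    # e.g. ['2', '739'] for "2km739".
    return "".join(re.findall(r"\d+", text))

result = digits_only("2km739")
print(result)  # 2739
```

The joining happens outside the regex because a single capture group always covers one contiguous slice of the input.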

In your regex you use [km], which is the notation for a character class: it matches a single k or a single m, not the sequence km.
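
The difference can be checked with a short sketch, assuming Python's re module; the variable names are only illustrative. It also suggests why the first pattern from the question cannot match: right after the leading digit, the lookbehind (?<=[km]) requires the previous character to be k or m, but that character is the digit itself.

```python
import re

text = "2km739"

# A character class matches exactly one character from the set.
single_chars = re.findall(r"[km]", text)   # ['k', 'm']

# The literal sequence "km" matches both characters together.
literal_km = re.findall(r"km", text)       # ['km']

# First pattern from the question: the lookahead and the lookbehind are
# evaluated at the same position and contradict each other, so no match.
first_try = re.search(r"([0-9](?=[km])(?<=[km])\d+)", text)

# Second pattern from the question: "km" is consumed, so it matches,
# but the captured text still contains the letters.
second_try = re.search(r"([0-9](?=[km])km(?<=[km])\d+)", text)

print(single_chars, literal_km, first_try, second_try.group(1))
```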

One option is to capture the two digit groups inside a positive lookahead and then join them:

^(?=(\d)km(\d+))

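import re

text = "2km739"  # renamed from str to avoid shadowing the built-in
reobj = re.compile(r"^(?=(\d)km(\d+))")
match = reobj.search(text)
# Group 1 is the leading digit and group 2 the digits after "km".
print(''.join(match.groups()))  # prints 2739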

