Exclude the middle of a capture group regex

Question

I have a string:

2km739

and I am trying to use a regex to capture 2739 (the digits without the km).

I know I could just use two capture groups and combine them after (EDIT: or extract the numerical chars after I capture the group), but this would be a little easier in this situation and I am curious if this is possible.

I have this:

([0-9](?=[km])(?<=[km])\d+)

but it doesn't work.

It only works if I add the km in there somewhere:

([0-9](?=[km])km(?<=[km])\d+)

I also thought this would work, but I learned that text matched by a non-capturing group is still captured by the enclosing group:

([0-9](?:km)\d+)

Answer 1

If you want to remove all of the letters and capture only digits, you can change the capture group to do that.

(\d+)

You'll need to merge all of the captured groups at the end, as you can't skip over pieces of the input without closing the capture group.
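
A minimal sketch of that merge step, assuming Python (the question does not name a language). The helper name digits_only is hypothetical, chosen only for illustration.

```python
import re

def digits_only(text):
    """Join every run of digits found in text into one string."""
    # re.findall returns each (\d+) match in order, e.g. ['2', '739'].
    return "".join(re.findall(r"(\d+)", text))

print(digits_only("2km739"))  # 2739
```

The regex itself still produces two separate matches; the joining happens in ordinary code after matching.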

Answer 2

In your regex you use [km], which is the notation for a character class: it matches a single character, either k or m, not the sequence km.
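
A short sketch, assuming Python's re module, of why the patterns from the question behave as they do. The variable names are illustrative only.

```python
import re

text = "2km739"

# [km] consumes exactly one character, so after "2k" the next
# character is "m", not a digit, and the match fails.
single = re.search(r"\d[km]\d+", text)
print(single)  # None

# Spelling out the literal sequence km consumes both letters.
literal = re.search(r"\dkm\d+", text)
print(literal.group())  # 2km739

# Lookarounds are zero-width: right after "2" the lookbehind (?<=[km])
# sees "2" as the previous character, so the original pattern cannot match.
lookaround = re.search(r"[0-9](?=[km])(?<=[km])\d+", text)
print(lookaround)  # None
```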

One option is to capture both parts inside a positive lookahead and then join them:

^(?=(\d)km(\d+))

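import re

text = "2km739"  # renamed from str to avoid shadowing the built-in
reobj = re.compile(r"^(?=(\d)km(\d+))")
match = reobj.search(text)
# The lookahead consumes nothing, but its groups still capture "2" and "739".
print(''.join(match.groups()))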


2 answers

solution1
0 2018-03-30 21:25:23

solution2
0 ACCEPTED 2018-03-31 11:07:47
