What am I doing wrong with this Python regex that is supposed to match repeats of a pattern, followed by an optional pattern?

Question

Here is what I am trying:

import re

r = re.compile(r'(?P<label>(?:[^_]+)+)(_r(?P<repeat_num>\d+))?')

def main():
    s1 = 'abc_123'
    s2 = 'abc_123_r1'

    m1 = r.match(s1)
    m2 = r.match(s2)

    print(m1.groups())
    print(m2.groups())

if __name__ == "__main__":
    main()

I expect the first string, s1, to give abc_123 for the label group and nothing for repeat_num.

I expect the second string, s2, to give abc_123 for the label group and '1' for repeat_num.

Instead, the match stops at abc in both cases: label is abc and repeat_num is None.

Answer 1

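It looks like the main cause is the [^_] bit, which matches "any character except underscore". The label group therefore cannot get past the first _, and since the _r group is optional, the match simply ends at abc.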
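To make that concrete, here is a minimal sketch that runs the question's pattern unchanged and prints what each group captured. The names original and results are only illustrative.

```python
import re

# The pattern from the question, unchanged.
original = re.compile(r'(?P<label>(?:[^_]+)+)(_r(?P<repeat_num>\d+))?')

results = {}
for s in ('abc_123', 'abc_123_r1'):
    m = original.match(s)
    # [^_]+ cannot consume the first underscore, and the optional
    # _r group does not match "_1", so the overall match ends after "abc".
    results[s] = (m.group(0), m.groupdict())
    print(s, '->', results[s])
```

Both inputs report a whole match of abc, with label set to abc and repeat_num set to None.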

I couldn't immediately figure out a solution that would properly capture these tokens; I highly recommend using RegExr to play with your regular expression in order to figure out how to match the pieces correctly.

Answer 2

Your pattern is not matching the _ between the abc and 123 pieces of your input strings. You need to modify your first capturing group in order to be able to handle those.

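A direct translation may run into difficulties, though, because it is hard to distinguish the final _r1 block from an ordinary extra block like _123. The pattern below handles this by matching the extra blocks lazily and anchoring the end with $. Without the anchor, the lazy *? is satisfied with zero extra blocks and r.match stops at abc again. You should still double-check that it does what you expect on your real inputs:

(?P<label>[^_]+(?:_[^_]+)*?)(?:_r(?P<repeat_num>\d+))?$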
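As a rough check of this approach, the sketch below applies the label and repeat_num groups with fullmatch(), which forces the whole string to be consumed. The helper name parse and the extra test strings are assumptions added for illustration; the question itself calls r.match.

```python
import re

# Lazy label blocks followed by an optional _r<digits> suffix.
# fullmatch() below anchors both ends, so no trailing $ is written here.
pattern = re.compile(r'(?P<label>[^_]+(?:_[^_]+)*?)(?:_r(?P<repeat_num>\d+))?')

def parse(s):
    """Return (label, repeat_num), or None if s does not match as a whole."""
    m = pattern.fullmatch(s)
    if m is None:
        return None
    return m.group('label'), m.group('repeat_num')

for s in ('abc_123', 'abc_123_r1', 'abc', 'abc_r12', 'abc_123_456_r7'):
    print(s, '->', parse(s))

# For comparison: an unanchored match() lets the lazy quantifier stop early.
unanchored_label = pattern.match('abc_123_r1').group('label')
print('unanchored label:', unanchored_label)
```

With fullmatch(), abc_123 yields ('abc_123', None) and abc_123_r1 yields ('abc_123', '1'), while the unanchored match() call still reports abc as the label.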

If you always require at least two underscore-separated groups in the first part of the text (e.g. abc_123, but never just abc or 123 by itself), replace the *? with +?.
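A small sketch of that stricter variant, again anchored with fullmatch(). The names strict and parse_strict are hypothetical. Note one side effect worth checking against your data: a string such as abc_r1 is now treated entirely as a label, because the label must contain at least one underscore.

```python
import re

# Same pattern with +? in place of *?: the label needs at least two
# underscore-separated blocks.
strict = re.compile(r'(?P<label>[^_]+(?:_[^_]+)+?)(?:_r(?P<repeat_num>\d+))?')

def parse_strict(s):
    """Return (label, repeat_num), or None if s does not match as a whole."""
    m = strict.fullmatch(s)
    return None if m is None else (m.group('label'), m.group('repeat_num'))

for s in ('abc', 'abc_123', 'abc_123_r1', 'abc_r1'):
    print(s, '->', parse_strict(s))
```

Here abc is rejected, abc_123 and abc_123_r1 behave as before, and abc_r1 comes back as ('abc_r1', None).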

2 answers

solution1
0 2018-02-07 23:09:53

solution2
0 2018-02-07 23:15:33
