Confusion about repeating pattern in python Regex

Question

I'm confused about repetition in Python regular expressions. The documentation says that '*' repeats the preceding pattern zero or more times. Suppose I have the string abc123def and I want to find the position of the substring made up of digits, so I use the following code:

import re
p = re.compile(r'[\d]*')
p.search('abc123def').span()

It outputs (0, 0). If I change the regex to [\d]+, it outputs (3, 6).

Why doesn't the regex r'[\d]*' work? Thanks.

Answer 1

It does work. [\d]* matches any sequence of digits, including zero digits, i.e. an empty string. (By the way, the brackets are unnecessary: \d* does exactly the same thing.) An empty string matches anywhere, in particular at the beginning of the string, and search returns the first match it finds. If you want a non-empty sequence of digits, use \d+ as you already did.
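To make this concrete, here is a minimal sketch that runs the three patterns against the string from the question. The variable names (text, star, bracketed, plus) are only illustrative and do not come from the original post.

```python
import re

text = 'abc123def'

# \d* is allowed to match zero digits, so search() succeeds right away
# at index 0 with an empty match.
star = re.search(r'\d*', text)
print(star.span(), repr(star.group()))   # (0, 0) ''

# The brackets add nothing here: [\d]* behaves the same way.
bracketed = re.search(r'[\d]*', text)
print(bracketed.span())                  # (0, 0)

# \d+ needs at least one digit, so search() keeps scanning until index 3.
plus = re.search(r'\d+', text)
print(plus.span(), plus.group())         # (3, 6) 123
```

Note that the empty match is still a real match object, so checking only whether search returned something will not tell the two cases apart; checking group() or the span will.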

Answer 2

It does work: it found a zero-length string at the beginning of the string.

Answer 3

Another way to see what is happening is to use findall:

>>> re.findall(r'\d*', 'abc123def')
['', '', '', '123', '', '', '', '']

vs

>>> re.findall(r'\d+', 'abc123def')
['123']

Or visually with regex101

The * means 'zero or more', and the match is taken at the first opportunity. You have zero digits at the start of the string. A match! The same thing then happens at every other position in the string, which is why findall returns mostly empty strings.

Use + if you want to match a non-empty substring.
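If you want to see where each of those matches sits, a small sketch with finditer shows the spans rather than just the matched text. The names star_spans, non_empty and plus_spans are made up for this illustration.

```python
import re

text = 'abc123def'

# Record the span of every match that \d* produces while scanning the string.
# Most of them are zero-length, such as (0, 0).
star_spans = [m.span() for m in re.finditer(r'\d*', text)]
print(star_spans)

# Dropping the empty matches leaves only the run of digits.
non_empty = [m.span() for m in re.finditer(r'\d*', text) if m.group()]
print(non_empty)    # [(3, 6)]

# \d+ never produces an empty match in the first place.
plus_spans = [m.span() for m in re.finditer(r'\d+', text)]
print(plus_spans)   # [(3, 6)]
```

Filtering out empty matches works, but using + states the intent directly and is the simpler choice here.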

3 answers

solution1
2 ACCEPTED 2017-07-06 14:38:15

solution2
1 2017-07-06 14:37:44

solution3
1 2017-07-06 14:43:16
