Confusion about repeating patterns in Python regex

Question

I am confused about repetition in Python regular expressions. I read in the documentation that '*' means repeating zero to N times. Suppose I have the string abc123def. I want to find the position of the substring containing numeric characters, so I use the following code:

import re
p = re.compile(r'[\d]*')
p.search('abc123def').span()

It outputs (0, 0). If I change the regex to [\d]+, it outputs (3, 6).

Why doesn't the regex r'[\d]*' work? Thanks.

Answer 1

It does work. [\d]* (BTW, the brackets are unnecessary; \d* does exactly the same) matches any sequence of digits, including zero digits, i.e. an empty string. An empty string matches anywhere, in particular at the beginning of the string. Since search returns the leftmost match, it stops there and never reaches 123. If you want a non-empty sequence of digits, use \d+ like you already did.
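To make this concrete, here is a minimal sketch that compares the patterns on the string from the question. The variable names are only illustrative.

```python
import re

text = 'abc123def'

# Brackets around a single character class are redundant:
# both patterns find the same (empty) match at index 0.
star_bracketed = re.search(r'[\d]*', text)
star_plain = re.search(r'\d*', text)
print(star_bracketed.span(), star_plain.span())  # (0, 0) (0, 0)

# The search succeeded, but what it matched is the empty string.
print(repr(star_plain.group()))  # ''

# Requiring at least one digit moves the match to the first digit run.
plus = re.search(r'\d+', text)
print(plus.span(), plus.group())  # (3, 6) 123
```

Note that the match object for \d* is not None; the search succeeds, it just matches nothing.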

Answer 2

It does work: it found a zero-length string at the beginning of the string.

Answer 3

Another way to see what is happening is to use findall:

>>> re.findall(r'\d*', 'abc123def')
['', '', '', '123', '', '', '', '']

vs.

>>> re.findall(r'\d+', 'abc123def')
['123']

Or view it visually with regex101.

The * means 'zero or more', matched at the first opportunity. There are zero digits at the start of the string, so that is a match. The same zero-length match then succeeds at every other position where no digit follows, including the end of the string.
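To see where each of those matches sits, one option is to list the span of every match with re.finditer. This is a small sketch, and the tuple layout (start, end, matched text) is just a choice made for the illustration.

```python
import re

text = 'abc123def'

# Collect (start, end, matched_text) for every match that \d* produces.
matches = [(m.start(), m.end(), m.group()) for m in re.finditer(r'\d*', text)]

for start, end, matched in matches:
    print(start, end, repr(matched))
```

Every entry except the one covering 123 has start equal to end, which is what a zero-length match looks like.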

Use + if you want to match a non-empty substring of digits.
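As a rough sketch of how the original goal (finding the position of the digits) might be wrapped up, the helper below uses \d+ and also handles input with no digits. The name first_digit_span is hypothetical and not from the thread.

```python
import re

def first_digit_span(text):
    """Return (start, end) of the first run of digits, or None if there is none."""
    # Unlike \d*, the pattern \d+ cannot match the empty string,
    # so search returns None when the text has no digits.
    match = re.search(r'\d+', text)
    return match.span() if match else None

print(first_digit_span('abc123def'))  # (3, 6)
print(first_digit_span('abcdef'))     # None
```

Checking for None matters here: calling .span() directly on a failed search would raise an AttributeError.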

Answer 1: score 2, accepted, 2017-07-06 14:38:15

Answer 2: score 1, 2017-07-06 14:37:44

Answer 3: score 1, 2017-07-06 14:43:16
