Difference between re.search() and re.findall()

Question

The following code behaves strangely:

 >>> words = "4324324 blahblah"
 >>> print re.findall(r'(\s)\w+', words)
 [' ']
 >>> print re.search(r'(\s)\w+', words).group()
 blahblah

The () operator seems to behave oddly with findall. Why is that? I need this for a CSV file.

Edit for clarity: I want to get blahblah using findall.

I found that re.findall(r'\s(\w+)', words) does what I want, but I don't understand why findall treats groups this way.

Answer 1

You are one character off (groups() versus group()):

>>> print re.search(r'(\s)\w+', words).groups()
(' ',)
>>> re.search(r'(\s)\w+', words).group(1)
' '

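When the pattern contains capturing groups, findall returns a list of the captured groups rather than the whole matches. You get the space because that is what you captured. Stop capturing and it works fine: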

>>> print re.findall(r'\s\w+', words)
[' blahblah']
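
As a rough illustration of the rule above, here is a minimal Python 3 sketch (the thread itself uses Python 2 print statements) showing how the return value of findall changes with the number of capturing groups. The variable names are only illustrative.

```python
import re

words = "4324324 blahblah"

# No capturing group: findall returns the full text of each match.
no_group = re.findall(r'\s\w+', words)

# One capturing group: findall returns only what that group captured.
one_group = re.findall(r'(\s)\w+', words)

# Two capturing groups: findall returns one tuple of groups per match.
two_groups = re.findall(r'(\s)(\w+)', words)

# Non-capturing group (?:...): groups without capturing, so full matches again.
non_capturing = re.findall(r'(?:\s)\w+', words)

print(no_group)       # [' blahblah']
print(one_group)      # [' ']
print(two_groups)     # [(' ', 'blahblah')]
print(non_capturing)  # [' blahblah']
```

The (?:...) form is handy when the parentheses are only needed for grouping or alternation and you still want findall to return whole matches.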

For the CSV file itself, use the csv module.
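
The question does not show the CSV file, so the following is only a sketch under assumptions: the sample text, the column names, and the choice of the second column are all hypothetical. It shows csv.reader splitting fields so that no regular expression is needed.

```python
import csv
import io

# Hypothetical CSV text; a real file would be opened with open(path, newline='').
sample = 'id,label\n4324324,blahblah\n'

# csv.reader yields each row as a list of field strings.
rows = list(csv.reader(io.StringIO(sample)))

# Skip the (assumed) header row and take the second column.
labels = [row[1] for row in rows[1:]]
print(labels)  # ['blahblah']
```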

Answer 2

If you prefer to keep the capturing groups in your regex but still want the full text of each match rather than the groups, you can use the following:

[m.group() for m in re.finditer(r'(\s)\w+', words)]

For example:

>>> [m.group() for m in re.finditer(r'(\s)\w+', '4324324 blahblah')]
[' blahblah']
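
Because finditer yields match objects, the whole match and the individual groups are both available from the same pass. A small Python 3 sketch, with illustrative variable names:

```python
import re

words = "4324324 blahblah"

# For each match, collect the whole match, the first group, and the match span.
results = [(m.group(0), m.group(1), m.span())
           for m in re.finditer(r'(\s)\w+', words)]

print(results)  # [(' blahblah', ' ', (7, 16))]
```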