Regex group capturing multiple matches

Question

Quick regular expression question.
I'm trying to capture multiple instances of a capture group in Python (I don't think it's Python-specific), but each subsequent capture seems to overwrite the previous one.

In this over-simplified example, I'm essentially trying to split a string:

import re
x = 'abcdef'
r = re.compile(r'(\w){6}')
m = r.match(x)
m.groups()     # = ('f',) ?!?

I want to get ('a', 'b', 'c', 'd', 'e', 'f'), but because the regex overwrites earlier captures with later ones, I get ('f',).

Is this how regex is supposed to behave? Is there a way to do what I want without having to repeat the syntax six times?

Thanks in advance!
Andrew

Answer 1

You can't use groups for this, I'm afraid. A repeated group keeps only its last match, and I believe most regex engines work this way. A possible solution is to use findall() or similar.

r = re.compile(r'\w')
r.findall(x)
# ['a', 'b', 'c', 'd', 'e', 'f']
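
If the overall shape of the string also needs checking (as the {6} in the question implies), one option is to validate the whole string first and then pull out the pieces with findall(). This is only a minimal sketch: the helper name capture_all and its parameters are made up for illustration, and it assumes the unit pattern contains no capturing groups of its own.

```python
import re

def capture_all(unit, text, count):
    """Return every substring matched by `unit`, provided `text` is exactly
    `count` repetitions of it; otherwise return None.

    `capture_all` is a hypothetical helper, not part of the re module.
    """
    # Validate the whole string with a non-capturing repeated group.
    if re.fullmatch(r'(?:%s){%d}' % (unit, count), text) is None:
        return None
    # Extract each repetition as its own list item.
    return re.findall(unit, text)

print(capture_all(r'\w', 'abcdef', 6))  # ['a', 'b', 'c', 'd', 'e', 'f']
print(capture_all(r'\w', 'abc', 6))     # None
```

For a single-character unit like \w the two steps always agree; for more elaborate units, findall() might divide the string differently from the repeated group, so treat this as a starting point.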

Answer 2

The regex module (a third-party package, installed separately from the standard library) can do this.

> import regex
> m = regex.match(r'(\w){6}', "abcdef")
> m.captures(1)
['a', 'b', 'c', 'd', 'e', 'f']

Also works with named captures:

> m = regex.match(r'(?P<letter>\w)+', "abcdef")
> m.capturesdict()
{'letter': ['a', 'b', 'c', 'd', 'e', 'f']}

The regex module was intended to eventually replace the 're' module. It is designed to be backwards-compatible with 're', so it can usually be used as a drop-in replacement, while offering many more features and capabilities.

Answer 3

To find all matches in a given string, use re.findall(regex, string). Also, if you want to obtain every letter here, your regex should be either '(\\w){1}' or just '(\\w)'.

See:

r = re.compile(r'(\w)')
l = re.findall(r, x)

l == ['a', 'b', 'c', 'd', 'e', 'f']
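
Whether the pattern has parentheses matters for findall(), because its return value depends on how many capturing groups the pattern contains. The sketch below runs the three cases on the same input; the variable names are only for illustration.

```python
import re

x = 'abcdef'

# No group: each item is the whole match.
no_group = re.findall(r'\w', x)

# One group: each item is the text of that group.
one_group = re.findall(r'(\w)', x)

# Two groups: each item is a tuple with one entry per group.
two_groups = re.findall(r'(\w)(\w)', x)

print(no_group)    # ['a', 'b', 'c', 'd', 'e', 'f']
print(one_group)   # ['a', 'b', 'c', 'd', 'e', 'f']
print(two_groups)  # [('a', 'b'), ('c', 'd'), ('e', 'f')]
```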

Answer 4

I suppose your question is a simplified presentation of your need.

So I'll take a slightly more complex example:

import re

pat = re.compile('[UI][bd][ae]')

ch = 'UbaUdeIbaIbeIdaIdeUdeUdaUdeUbeIda'

print([mat.group() for mat in pat.finditer(ch)])

Result:

['Uba', 'Ude', 'Iba', 'Ibe', 'Ida', 'Ide', 'Ude', 'Uda', 'Ude', 'Ube', 'Ida']
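
Because finditer() yields match objects, each occurrence can carry its own groups and position, which is one way around the one-value-per-group limit. A small sketch reusing the pattern above; the capturing groups and the shortened input string are my additions for illustration.

```python
import re

# Same character classes as above, each wrapped in a capturing group.
pat = re.compile(r'([UI])([bd])([ae])')
ch = 'UbaUdeIba'

# One (start offset, groups tuple) pair per occurrence.
parts = [(mat.start(), mat.groups()) for mat in pat.finditer(ch)]

for start, groups in parts:
    print(start, groups)
# 0 ('U', 'b', 'a')
# 3 ('U', 'd', 'e')
# 6 ('I', 'b', 'a')
```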

Answer 1: score 14, accepted, posted 2011-04-08 17:10:54
Answer 2: score 4, posted 2015-06-15 15:10:07
Answer 3: score 2, posted 2011-04-08 17:10:58
Answer 4: score 1, posted 2011-04-08 19:12:24
