Python 3 regex - Finding the start and end indices of all overlapping matches in a string

Question

This was my original approach:

import re

string = '1' * 15
result = re.finditer(r'(?=11111)', string)  # overlapped = True doesn't work for me
for i in result:                            # Python 3.5
    print(i.start(), i.end())

It finds all overlapping matches, but fails to get the right end index. The output:

1 <_sre.SRE_Match object; span=(0, 0), match=''>
2 <_sre.SRE_Match object; span=(1, 1), match=''>
3 <_sre.SRE_Match object; span=(2, 2), match=''>
4 <_sre.SRE_Match object; span=(3, 3), match=''>
(and so on..)

My question: How can I find all overlapping matches and also get the correct start and end index of each one?

Answer 1

The problem is that a lookahead is a zero-width assertion: it consumes no text (i.e. it adds nothing to the match result). It is merely a position in the string. Thus, each of your matches starts and ends at the same location in the string.
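
To make the zero-width behaviour concrete, here is a minimal sketch using the same pattern and string as the question. The variable names are chosen only for illustration.

```python
import re

text = '1' * 15

# A bare lookahead succeeds at a position but consumes no characters,
# so the overall match is empty and its start equals its end.
first = next(re.finditer(r'(?=11111)', text))
zero_width_span = first.span()
matched_text = first.group()

print(zero_width_span)     # (0, 0)
print(repr(matched_text))  # ''
```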

You need to wrap the lookahead pattern in a capturing group (i.e. (?=(11111))) and access the start and end of group 1 (with i.start(1) and i.end(1)):

import re
s = '1'*15     
result = re.finditer(r'(?=(11111))', s)

for i in result:
    print(i.start(1), i.end(1))

See the Python demo; its output is:

0 5
1 6
2 7
3 8
4 9
5 10
6 11
7 12
8 13
9 14
10 15
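
If you need this in more than one place, the same idea can be wrapped in a small function. This is only a sketch: overlapping_spans is a hypothetical helper name, not part of the re module, and it assumes the pattern is a plain pattern string without leading inline flags such as (?i). Because the lookahead reports one match per starting position, you get at most one span for each start index.

```python
import re

def overlapping_spans(pattern, text):
    """Return (start, end) pairs for overlapping matches of pattern in text.

    Hypothetical helper: it wraps the pattern in a lookahead containing a
    capturing group, then reads the span of that group (group 1).
    """
    wrapped = re.compile('(?=(' + pattern + '))')
    return [m.span(1) for m in wrapped.finditer(text)]

spans = overlapping_spans('11111', '1' * 15)
print(spans[:3])   # [(0, 5), (1, 6), (2, 7)]
print(len(spans))  # 11
```

Note that capturing groups inside the pattern you pass in are shifted by one, since the wrapper adds its own group 1.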

Answer 2

Can you compare with this implementation and see where the differences might be?

import re

match = re.finditer(r'111', 'test111 end111 and another 111')
for i in match:
    print(i.start(), i.end())

If this is not working for you, kindly share a sample of your data.
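
For comparison, a plain pattern like the one above reports correct start and end indices, but only for non-overlapping matches. The sketch below runs both styles against the question's string; the variable names are illustrative only.

```python
import re

text = '1' * 15

# Plain pattern: each match consumes its five characters, so scanning
# resumes after the match and overlapping candidates are skipped.
plain = [m.span() for m in re.finditer(r'11111', text)]

# Lookahead with a capturing group: nothing is consumed, so every
# starting position in the string is tried.
overlapping = [m.span(1) for m in re.finditer(r'(?=(11111))', text)]

print(plain)             # [(0, 5), (5, 10), (10, 15)]
print(len(overlapping))  # 11
```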

Answer 1: score 4, accepted, 2017-03-31 20:24:06

Answer 2: score 1, 2017-03-31 20:32:35
