Python 3 regex: find the start and end index of all overlapping matches in a string

This was my original approach:

import re

string = '1'*15
result = re.finditer(r'(?=11111)', string)      # overlapped=True
                                                # doesn't work for me
for i in result:                                # Python 3.5
    print(i.start(), i.end())

It finds all overlapping matches, but fails to get the right end index. The output:

1 <_sre.SRE_Match object; span=(0, 0), match=''>
2 <_sre.SRE_Match object; span=(1, 1), match=''>
3 <_sre.SRE_Match object; span=(2, 2), match=''>
4 <_sre.SRE_Match object; span=(3, 3), match=''>
(and so on..)

My question: how can I find all overlapping matches and also get the correct start and end index of each one?

The problem is that a lookahead is a zero-width assertion: it consumes no text (i.e. it adds nothing to the match result). It is a mere position in the string. Thus, each of your matches starts and ends at the same location in the string.
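A minimal sketch of that zero-width behaviour, using the same 15-character string of ones as in the question (the variable names `s` and `m` are only illustrative):

```python
import re

s = '1' * 15
m = re.search(r'(?=11111)', s)

# The lookahead succeeds at index 0 but consumes nothing,
# so the overall match is the empty string with a zero-width span.
print(m.span())         # (0, 0)
print(repr(m.group()))  # ''
```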

You need to enclose the lookahead pattern in a capturing group, i.e. (?=(11111)), and access the start and end of group 1 with i.start(1) and i.end(1):

import re
s = '1'*15     
result = re.finditer(r'(?=(11111))', s)

for i in result:
    print(i.start(1), i.end(1))

See the Python demo; its output is:

0 5
1 6
2 7
3 8
4 9
5 10
6 11
7 12
8 13
9 14
10 15
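The same idea can be wrapped in a small helper. This is only a sketch: `overlapping_spans` is a hypothetical name, not part of the `re` module, and it assumes the pattern is a plain regex string without leading inline flags. Because the pattern is wrapped in an extra capturing group, any groups inside it are shifted by one, and the regex engine reports at most one match per start position.

```python
import re

def overlapping_spans(pattern, text):
    """Return (start, end) pairs for overlapping matches of pattern in text."""
    # Wrap the pattern in a lookahead containing a capturing group, so the
    # overall match is zero-width and group 1 holds the text actually matched.
    wrapped = re.compile(r'(?=(' + pattern + r'))')
    return [m.span(1) for m in wrapped.finditer(text)]

print(overlapping_spans('11111', '1' * 15)[:3])  # [(0, 5), (1, 6), (2, 7)]
print(overlapping_spans('aba', 'ababa'))         # [(0, 3), (2, 5)]
```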

Can you compare with this implementation and see where the differences might be?

import re

match = re.finditer(r'111', 'test111 end111 and another 111')
for i in match:
    print(i.start(), i.end())

If this is not working for you, kindly share a sample of your data.
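Note that a plain pattern without a lookahead consumes the text it matches, so finditer only reports non-overlapping matches. A short comparison on the string from the question (the names `plain` and `overlapping` are just for illustration):

```python
import re

s = '1' * 15

# A plain pattern consumes five characters per match, so the next search
# resumes after the previous match and overlapping hits are skipped.
plain = [m.span() for m in re.finditer(r'11111', s)]

# A lookahead with a capturing group consumes nothing, so every start
# position is tried and group 1 carries the real span.
overlapping = [m.span(1) for m in re.finditer(r'(?=(11111))', s)]

print(plain)             # [(0, 5), (5, 10), (10, 15)]
print(len(overlapping))  # 11
```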
