Python regular expression findall using lookahead

Question

I'm new to regular expressions and would like to understand how findall() and lookahead can be used to find all occurrences of a given pattern within a string. I am having problems with alternating characters. Here is an example of what I want:

s = 'ababa4abaab'
p = 'aba'
print([ s[i:i+len(p)] for i in range(len(s)) if s[i:i+len(p)]==p])
['aba', 'aba', 'aba']

Here is my attempt with findall():

import re
re.findall('aba', 'ababa4abaab')
['aba', 'aba']

It only returns 2 matches, but I want all three. I read this tutorial but did not quite understand it. I tried:

re.findall('(?=aba)', 'ababa4abaab')
['', '', '']

Can someone please tell me how to use the lookahead concept in this case and provide a brief explanation of how it works?

Answer 1

I think you just need to check whether there is an 'ab' with an 'a' right after it. You don't need to match it as 'aba'; you can use this lookahead:

ab(?=a)

which gives you 3 matches.

You can also capture it in a group, then iterate over the matches and append 'a' to each one, so you end up with the desired text 'aba' for each match:

 (ab(?=a))
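
As a minimal sketch of the approach above, assuming the input string from the question (the variable names are only illustrative): the lookahead (?=a) checks that an 'a' follows without consuming it, so that same 'a' is still available to start the next match.

```python
import re

s = 'ababa4abaab'

# 'ab' is consumed by the match; the lookahead only checks that 'a' follows.
prefixes = re.findall(r'ab(?=a)', s)

# Re-attach the 'a' that the lookahead checked but did not consume.
matches = [m + 'a' for m in prefixes]

print(prefixes)  # ['ab', 'ab', 'ab']
print(matches)   # ['aba', 'aba', 'aba']
```

Note that appending a fixed 'a' only works because the text after the lookahead is a known literal.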

Answer 2

The official documentation for findall says:

"Return a list of all non-overlapping matches in the string."
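
That rule is why the plain pattern 'aba' finds only two results: the first match consumes indices 0 to 2, so the occurrence starting at index 2 is skipped. One common workaround, which is not taken from the answers here, is to put a capture group inside the lookahead. The lookahead itself matches an empty string at each position (hence the ['', '', ''] in the question), but findall returns the contents of the group instead. A sketch, with illustrative variable names:

```python
import re

s = 'ababa4abaab'

# Plain pattern: each match consumes its characters, so overlaps are lost.
plain = re.findall(r'aba', s)

# Zero-width lookahead with a capture group: nothing is consumed,
# so the scan can match again at an overlapping position.
overlapping = re.findall(r'(?=(aba))', s)

# finditer exposes where each zero-width match starts.
starts = [m.start() for m in re.finditer(r'(?=aba)', s)]

print(plain)        # ['aba', 'aba']
print(overlapping)  # ['aba', 'aba', 'aba']
print(starts)       # [0, 2, 6]
```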
