Python Regex: match any repeated words that are separated by exactly one other word
I ran into a problem where I need to use a regex to find repeated words that are separated by exactly one other word.
For example:
"all in all" should return "all"
"good good good" should return nothing (the word in the middle is the same word, not a different one)
I have tried:
import re
p = re.compile(r'(\b\w+\b)\s\w+\s\1')
m = p.findall('all in all day in and day out bit by bit good good good')
print(m)
This returns ['all', 'bit', 'good'], but I only want it to return ['all', 'bit'].
Thanks in advance!
You just need to add a negative lookahead immediately after the initial capture group, so that the word right after it cannot be the captured word itself. This ensures your regex can't match (for example) good good good. The trailing \b also stops \1 from matching only the beginning of a longer word:
import re
p = re.compile(r'(\b\w+\b)(?!\s\1\b)\s\w+\s\1\b')
m = p.findall('all in all day in and day out bit by bit good good good')
print(m)
Output:
['all', 'bit']
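To see what each part of the fix contributes, here is a small comparison of the pattern from the question and the corrected one. The variable names `original` and `fixed` and the two extra test strings are made up for illustration; they are not from the question.

```python
import re

# Pattern from the question: no lookahead, no trailing word boundary.
original = re.compile(r'(\b\w+\b)\s\w+\s\1')
# Corrected pattern: negative lookahead plus trailing \b.
fixed = re.compile(r'(\b\w+\b)(?!\s\1\b)\s\w+\s\1\b')

# The negative lookahead rejects a middle word equal to the captured word.
print(original.findall('good good good'))  # ['good']
print(fixed.findall('good good good'))     # []

# The trailing \b rejects a third word that merely starts with the captured word.
print(original.findall('all in allow'))    # ['all']
print(fixed.findall('all in allow'))       # []
```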
If you want to include overlapping matches, wrap the entire regex in a positive lookahead (thanks @ggorlen):
p = re.compile(r'(?=(\b\w+\b)(?!\s\1\b)\s\w+\s\1\b)')
m = p.findall('foo bar foo bar foo')
print(m)
Output:
['foo', 'bar', 'foo']
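The reason the lookahead version finds more matches is that a lookahead is zero-width: it checks the text ahead without consuming it, so the next search can start inside the words just inspected. A minimal side-by-side sketch (the names `consuming` and `overlapping` are illustrative only):

```python
import re

text = 'foo bar foo bar foo'

# Consuming pattern: each match uses up its three words, so the
# search resumes after the first 'foo bar foo'.
consuming = re.compile(r'(\b\w+\b)(?!\s\1\b)\s\w+\s\1\b')
# Same pattern inside a lookahead: matches are zero-width, so every
# word position gets tested as a possible start.
overlapping = re.compile(r'(?=(\b\w+\b)(?!\s\1\b)\s\w+\s\1\b)')

consuming_result = consuming.findall(text)
overlapping_result = overlapping.findall(text)
print(consuming_result)    # ['foo']
print(overlapping_result)  # ['foo', 'bar', 'foo']
```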
If you also need to remove duplicate matches, convert the result to a set and back to a list (a set does not preserve order, so the order of the output may vary):
p = re.compile(r'(?=(\b\w+\b)(?!\s\1\b)\s\w+\s\1\b)')
m = list(set(p.findall('foo bar foo bar foo')))
print(m)
Output:
['foo', 'bar']
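If the order of first appearance matters, one possible alternative to a set is `dict.fromkeys`, which drops duplicates while keeping insertion order (guaranteed for dicts since Python 3.7). This is a variation on the answer rather than part of it, and the name `unique_in_order` is just illustrative:

```python
import re

p = re.compile(r'(?=(\b\w+\b)(?!\s\1\b)\s\w+\s\1\b)')
matches = p.findall('foo bar foo bar foo')  # ['foo', 'bar', 'foo']

# dict keys are unique and keep insertion order, so this removes
# duplicates without shuffling the results.
unique_in_order = list(dict.fromkeys(matches))
print(unique_in_order)  # ['foo', 'bar']
```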
No need for regex; normal programming constructs can handle this sort of problem just fine. Write a loop and add a conditional:
s = 'all in all day in and day out bit by bit good good good'
words = s.split()
result = []
for i in range(len(words) - 2):
    if words[i] == words[i+2] and words[i] != words[i+1]:
        result.append(words[i])
print(result) # ['all', 'bit']
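One thing to keep in mind is that `split()` leaves punctuation attached to words, so 'all in all,' would not count. A possible variant, assuming punctuation should be ignored, tokenizes with `re.findall(r'\w+', ...)` and walks the words in sliding triples. The function name `sandwiched_words` is hypothetical, and unlike the regex answers this version does not require exactly one whitespace character between words:

```python
import re

def sandwiched_words(text):
    """Return each word that reappears after exactly one different word."""
    # Assumption: a "word" is a run of \w characters; punctuation is dropped.
    words = re.findall(r'\w+', text)
    # zip over three staggered views yields every consecutive triple.
    return [a for a, b, c in zip(words, words[1:], words[2:])
            if a == c and a != b]

print(sandwiched_words('all in all, day in and day out; bit by bit. good good good'))
# ['all', 'bit']
```

Because every triple is checked, overlapping cases such as 'foo bar foo bar foo' are reported as well.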