Match two words in arbitrary order using regex

I have spent some time learning regular expressions, but I still don't understand how the following trick matches two words regardless of their order.

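import re

# Two lookaheads anchored at the start of each line, then .* to consume the line.
reobj = re.compile(r'^(?=.*?(John))(?=.*?(Peter)).*$', re.MULTILINE)

string = '''
John and Peter
Peter and John
James and Peter and John
'''
# With two capture groups, findall returns one (group 1, group 2) tuple per match.
print(reobj.findall(string))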

Result:

[('John', 'Peter'), ('John', 'Peter'), ('John', 'Peter')]

( https://www.regex101.com/r/qW4rF4/1 )

I know the (?=.* ) part is called a positive lookahead, but how does it work in this situation?

Any explanation?

It does not really match the words in any particular order at all. The consuming here is done by .*, which matches everything that comes its way. A positive lookahead only makes an assertion. You have two lookaheads, and they are independent of each other: each one asserts that one word is present somewhere on the line. The capture groups inside the lookaheads are still filled, which is why findall always returns ('John', 'Peter') in that group order, wherever the words appear. So your regex works like this:

1) (?=.*?(John)) === The line must contain John somewhere ahead. Just an assertion. Does not consume anything.

2) (?=.*?(Peter)) === The line must contain Peter somewhere ahead. Just an assertion. Does not consume anything.

3) .* === Consumes the whole line, but only if both assertions have passed.
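The three steps can be checked one at a time. This is only a small sketch: the sample line and the variable names m1, m2, m3 are made up for illustration. It shows that each lookahead succeeds without moving the match position, and that only the final .* consumes text.

```python
import re

line = "Peter and John"  # sample input, chosen so the words appear in "reverse" order

# Step 1 alone: the lookahead succeeds, but the overall match is empty (zero-width).
m1 = re.match(r"(?=.*?(John))", line)
print(m1.span())     # (0, 0)  -> nothing was consumed
print(m1.group(1))   # John    -> the group inside the lookahead is still captured

# Steps 1 and 2: both assertions are tested from the same position (0),
# so neither one cares where the other word sits.
m2 = re.match(r"(?=.*?(John))(?=.*?(Peter))", line)
print(m2.span(), m2.groups())   # (0, 0) ('John', 'Peter')

# Step 3: only now does .* consume the line.
m3 = re.match(r"(?=.*?(John))(?=.*?(Peter)).*$", line)
print(m3.group(0))   # Peter and John
```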

So you see the order does not matter here. What is important is that both assertions pass.
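To see that only the assertions matter, the same idea can be generalised to any list of words. The helper below is a hypothetical sketch, not part of the original question; all_words_pattern and the extra sample lines are assumptions made for the demonstration.

```python
import re

def all_words_pattern(words):
    """Hypothetical helper: one lookahead per word, then .* to consume the line."""
    lookaheads = "".join("(?=.*?({}))".format(re.escape(w)) for w in words)
    return re.compile("^" + lookaheads + ".*$", re.MULTILINE)

text = (
    "John and Peter\n"
    "Peter and John\n"
    "James and Peter and John\n"
    "John alone\n"
    "Peter alone\n"
)

pattern = all_words_pattern(["John", "Peter"])

# Lines pass only when every assertion passes; the word order is irrelevant.
matched = [m.group(0) for m in pattern.finditer(text)]
print(matched)
# ['John and Peter', 'Peter and John', 'James and Peter and John']
```

The last two lines are rejected because . does not match a newline by default, so each lookahead can only see the rest of its own line.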
