
Difference between ?:, ?! and ?=

I searched for the meaning of these expressions but couldn't understand the exact difference between them.

This is what they say:

  • ?: Match expression but do not capture it.
  • ?= Match a suffix but exclude it from capture.
  • ?! Match if the suffix is absent.

I tried using these in simple RegEx and got similar results for all.

For example: the following 3 expressions give very similar results.

  • [a-zA-Z0-9._-]+@[a-zA-Z0-9-]+(?!\\.[a-zA-Z0-9]+)*
  • [a-zA-Z0-9._-]+@[a-zA-Z0-9-]+(?=\\.[a-zA-Z0-9]+)*
  • [a-zA-Z0-9._-]+@[a-zA-Z0-9-]+(?:\\.[a-zA-Z0-9]+)*

The difference between ?= and ?! is that the former requires the given expression to match and the latter requires it to not match. For example a(?=b) will match the "a" in "ab", but not the "a" in "ac". Whereas a(?!b) will match the "a" in "ac", but not the "a" in "ab".
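The a(?=b) and a(?!b) cases can be checked with Python's re module. This is only a small sketch; the variable names are made up for illustration.

```python
import re

positive = re.compile(r"a(?=b)")  # "a" only when the next character is "b"
negative = re.compile(r"a(?!b)")  # "a" only when the next character is NOT "b"

pos_ab = positive.search("ab")
pos_ac = positive.search("ac")
neg_ab = negative.search("ab")
neg_ac = negative.search("ac")

print(pos_ab.group())  # a
print(pos_ac)          # None
print(neg_ab)          # None
print(neg_ac.group())  # a
```

In both successful cases only the "a" is part of the match; the character after it is inspected but not consumed.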

The difference between ?: and ?= is that ?= excludes the expression from the entire match, while ?: just doesn't create a capturing group. So for example a(?:b) will match the "ab" in "abc", while a(?=b) will only match the "a" in "abc". a(b) would match the "ab" in "abc" and create a capture containing the "b".
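To see the three variants side by side, here is a minimal Python sketch (the variable names are just for this example) that prints the overall match and the captured groups for each pattern applied to "abc":

```python
import re

non_capturing = re.search(r"a(?:b)", "abc")
lookahead = re.search(r"a(?=b)", "abc")
capturing = re.search(r"a(b)", "abc")

# group() is the whole match, groups() is the tuple of captures
print(non_capturing.group(), non_capturing.groups())  # ab ()
print(lookahead.group(), lookahead.groups())          # a ()
print(capturing.group(), capturing.groups())          # ab ('b',)
```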

?:  is for non capturing group
?=  is for positive look ahead
?!  is for negative look ahead
?<= is for positive look behind
?<! is for negative look behind
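
The two lookbehind forms work the same way as the lookaheads, but inspect the text before the current position. A small Python sketch, assuming a made-up sample string and patterns chosen only for illustration:

```python
import re

text = "cost $42 or 17"

# Digits that are immediately preceded by "$"
after_dollar = re.findall(r"(?<=\$)\d+", text)
# Whole numbers that are NOT immediately preceded by "$"
not_after_dollar = re.findall(r"(?<!\$)\b\d+", text)

print(after_dollar)      # ['42']
print(not_after_dollar)  # ['17']
```

The "$" is never part of the result, because a lookbehind, like a lookahead, consumes nothing.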

Please check here: http://www.regular-expressions.info/lookaround.html for a very good tutorial and examples on lookahead in regular expressions.

To better understand, let's apply the three expressions plus a capturing group and analyse each behaviour.

  • () capturing group - the regex inside the parentheses must be matched, and the match creates a capturing group
  • (?:) non-capturing group - the regex inside the parentheses must be matched, but no capturing group is created
  • (?=) positive lookahead - asserts that the regex must match at this position
  • (?!) negative lookahead - asserts that the regex cannot match at this position

Let's apply q(u)i to quit. q matches q and the capturing group u matches u. The match inside the capturing group is taken and a capturing group is created. So the engine continues with i. And i will match i. This last match attempt is successful. qui is matched and a capturing group with u is created.

Let's apply q(?:u)i to quit. Again, q matches q and the non-capturing group u matches u. The match from the non-capturing group is taken, but the capturing group is not created. So the engine continues with i. And i will match i. This last match attempt is successful. qui is matched.

Let's apply q(?=u)i to quit. The lookahead is positive and is followed by another token. Again, q matches q and u matches u. The match from the lookahead must be discarded, so the engine steps back from i in the string to u. The lookahead was successful, so the engine continues with i. But i cannot match u. So this match attempt fails.

Let's apply q(?=u)u to quit. The lookahead is positive and is followed by another token. Again, q matches q and u matches u. The match from the lookahead must be discarded, so the engine steps back from i in the string to u. The lookahead was successful, so the engine continues with u. And u will match u. So this match attempt is successful. qu is matched.

Let's apply q(?!i)u to quit. Here the lookahead is negative, and it is followed by another token. Again, q matches q. The lookahead then tries i against the next character, u; i doesn't match u, so the negative lookahead succeeds. A lookahead consumes nothing, so the engine is still at u in the string. The lookahead was successful, so the engine continues with u. And u will match u. So this match attempt is successful. qu is matched.
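
The five walkthroughs above can be reproduced with a short Python loop. This is just a sketch for checking the results; the `results` dictionary is a helper introduced here, not something from the tutorial.

```python
import re

patterns = [r"q(u)i", r"q(?:u)i", r"q(?=u)i", r"q(?=u)u", r"q(?!i)u"]

results = {}
for pattern in patterns:
    m = re.search(pattern, "quit")
    # Store (whole match, captured groups), or None when the pattern fails
    results[pattern] = (m.group(), m.groups()) if m else None
    print(pattern, "->", results[pattern])

# q(u)i -> ('qui', ('u',))
# q(?:u)i -> ('qui', ())
# q(?=u)i -> None
# q(?=u)u -> ('qu', ())
# q(?!i)u -> ('qu', ())
```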

So, in conclusion, the real difference between lookahead and non-capturing groups comes down to whether you just want to test that something is present (lookahead, which consumes nothing) or to test it and make it part of the match (non-capturing group). Capturing groups carry some extra cost, because the engine has to record the captured text, so use them judiciously.

Try matching foobar against these:

/foo(?=b)(.*)/
/foo(?!b)(.*)/

The first regex will match and will return "bar" as the first submatch: (?=b) matches the 'b' but does not consume it, leaving it for the following parentheses.

The second regex will NOT match, because it expects "foo" to be followed by something different from 'b'.

(?:...) has exactly the same effect as simple (...), but it does not return that portion as a submatch.
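
The regexes above are written with /.../ delimiters; the sketch below assumes the same patterns translated to Python's re module, plus a (?:b) variant added here for comparison.

```python
import re

positive = re.search(r"foo(?=b)(.*)", "foobar")
negative = re.search(r"foo(?!b)(.*)", "foobar")
non_capturing = re.search(r"foo(?:b)(.*)", "foobar")

print(positive.group(1))       # bar  (the "b" was checked, not consumed)
print(negative)                # None (foo IS followed by "b")
print(non_capturing.group(1))  # ar   (the "b" was consumed by (?:b))
print(non_capturing.groups())  # ('ar',)  (?:b) produced no submatch of its own
```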

The simplest way to understand assertions is to treat them as commands inserted into a regular expression. When the engine reaches an assertion, it immediately checks the condition the assertion describes. If the result is true, it continues running the rest of the regular expression from the same position; otherwise this match attempt fails.
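
One way to observe this "check, then carry on from the same position" behaviour is to look at match spans in Python. The patterns below are made-up examples for illustration.

```python
import re

# The assertion passes and nothing follows it: the match stops before "bar"
check_only = re.search(r"foo(?=bar)", "foobar")
# The assertion passes, then "bar" is matched for real from the same position
check_then_consume = re.search(r"foo(?=bar)bar", "foobar")
# The assertion fails, so the rest of the pattern is never tried
failed_check = re.search(r"foo(?=baz)bar", "foobar")

print(check_only.span())          # (0, 3)
print(check_then_consume.span())  # (0, 6)
print(failed_check)               # None
```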

This is the real difference:

>>> import re
>>> re.match('a(?=b)bc', 'abc')
<Match...>
>>> re.match('a(?:b)c', 'abc')
<Match...>

# note: the lookahead does not consume "b", so "c" is tried against "b" and fails
>>> print(re.match('a(?=b)c', 'abc'))
None

If you don't care about the content matched after "?:" or "?=" (you only need it to be present), "?:" and "?=" do much the same job, and either is OK to use.

But if you need that content for further processing (not just to match the whole thing; in that case you can simply use "a(b)"), you have to use "?=" instead, because "?:" will just throw it away (it is consumed, so the rest of the pattern cannot match it again).
