
`regex{n,}?` == `regex{n}`?

-edit- NOTE the ? at the end of .{2,}?

I found out you can write

.{2,}?

Isn't that exactly the same as below?

.{2}

No. {2,} means two or more times, while {2} means exactly two times. Quantifiers are greedy by default, so given the string foo, .{2,} matches foo, but .{2,}? matches only fo because you made it lazy. However, the lazy version is still allowed to match more than two characters if necessary, whereas .{2} always means exactly two characters.

So if you have the string test123 and the pattern .{2,}?\d, you would get test1: the lazy quantifier has to expand to four characters (test) so that the \d can match the 1.
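The behaviour described above can be checked with a short Python sketch. It assumes Python's standard `re` module, which is not what the answer itself used; the helper name `first_match` is made up for illustration.

```python
import re

def first_match(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# Greedy: takes as many characters as it can.
greedy = first_match(r".{2,}", "foo")
# Lazy: stops at the two-character minimum because nothing follows it.
lazy = first_match(r".{2,}?", "foo")
# Exact: always two characters.
exact = first_match(r".{2}", "foo")
# Lazy, but forced to grow until the following \d can match.
forced = first_match(r".{2,}?\d", "test123")

print(greedy, lazy, exact, forced)  # foo fo fo test1
```

On its own the lazy form gives the same result as .{2}; the difference only shows once something after the quantifier forces it to expand.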

No, they are different. ^.{2,}?$ matches strings whose length is at least 2 (as seen on rubular.com):

12
123
1234

By contrast, ^.{2}$ only matches strings whose length is exactly 2 (as seen on rubular.com).

It's correct that, being reluctant, .{2,}? will first attempt to match only two characters. But for the overall pattern to match, it can take more. This is not the case with .{2}, which can only match exactly 2 characters.
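As a rough check of the anchored patterns, here is a sketch using Python's `re` module (an assumption; the answer used rubular.com, which runs Ruby). The sample list reuses the three strings above and adds a one-character string "1" as an extra, hypothetical negative case.

```python
import re

# "1" is an added negative case; the other three come from the answer.
samples = ["1", "12", "123", "1234"]

# Anchored on both sides, the lazy quantifier must expand to cover the whole string.
at_least_two = [s for s in samples if re.search(r"^.{2,}?$", s)]
# The exact quantifier can never cover more than two characters.
exactly_two = [s for s in samples if re.search(r"^.{2}$", s)]

print(at_least_two)  # ['12', '123', '1234']
print(exactly_two)   # ['12']
```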


In isolation they probably behave identically, but not inside larger expressions, because the lazy version is allowed to match more than two symbols.

             abx        abcx

^.{2,}?x$    match      match
^.{2}x$      match      no match
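The table can be reproduced with a small sketch, assuming Python's `re` module; the variable names are illustrative only.

```python
import re

patterns = [r"^.{2,}?x$", r"^.{2}x$"]
inputs = ["abx", "abcx"]

# Build the same pattern-by-input grid as the table above.
table = {p: {s: bool(re.search(p, s)) for s in inputs} for p in patterns}

for pattern, row in table.items():
    cells = "  ".join("match" if row[s] else "no match" for s in inputs)
    print(f"{pattern:<12} {cells}")
```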

What makes this question especially interesting is that there are times when .{2,}? is equivalent to .{2}, but it should never happen, because those are places where a reluctant quantifier does not belong. Others have already pointed out how a reluctant quantifier at the very end of a regex always matches the minimum number of characters, because there's nothing after it to force it to consume more.

The other place they shouldn't be used is at the end of a subexpression inside an atomic group. For example, suppose you try to match foo bar with

f(?>.+?) bar

The subexpression initially consumes the first 'o' and hands off to the next part, which tries unsuccessfully to match a space. Without the atomic group, it would backtrack and let the .+? consume another character. But it can't backtrack into the atomic group, and there's no wiggle room before the group, so the match attempt fails.

A reluctant quantifier at the end of a regex or at the end of an atomic subexpression is a definite code smell.

Not exactly. Using PHP to do a regexp match and display the capture:

$string = 'aaabbaabbbaaa';

$search = preg_match_all('/b{2}a/',$string,$matches,PREG_SET_ORDER );

echo '<pre>';
var_dump($matches);
echo '</pre>';

$search = preg_match_all('/b{2,}?a/',$string,$matches,PREG_SET_ORDER );

echo '<pre>';
var_dump($matches);
echo '</pre>';

The first result gives:

array(2) {
  [0]=>
  array(1) {
    [0]=>
    string(3) "bba"
  }
  [1]=>
  array(1) {
    [0]=>
    string(3) "bba"
  }
}

The second gives:

array(2) {
  [0]=>
  array(1) {
    [0]=>
    string(3) "bba"
  }
  [1]=>
  array(1) {
    [0]=>
    string(4) "bbba"
  }
}

With b{2} the capture only returns 2 b's; with b{2,}? it returns 2 or more.

x.{2,}?x matches "xasdfx" in "xasdfxbx", but x.{2}x does not match at all.

Without the trailing ?, the first one will match the whole string.
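The three cases in this answer can be compared side by side. This is a minimal sketch assuming Python's `re` module; other regex engines should behave the same way here, but that is not verified.

```python
import re

text = "xasdfxbx"

lazy = re.search(r"x.{2,}?x", text)    # expands only until the next x fits
exact = re.search(r"x.{2}x", text)     # no x is followed by an x three places later
greedy = re.search(r"x.{2,}x", text)   # grabs everything, then backtracks to the last x

print(lazy.group(0))    # xasdfx
print(exact)            # None
print(greedy.group(0))  # xasdfxbx
```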

No, they are different:

.{2,}? : Any character, at least 2 repetitions, as few as possible

.{2} : Any character, exactly 2 repetitions
