
PHP 5.6 regex unexpected behaviour

I have come across a strange behaviour in PHP 5.6 (not tested with other versions):

var_dump(preg_match('#\b(39||90)\b#', '42')); // int(1)
var_dump(preg_match('#\b(39||90)\b#', '')); // int(0)

https://regex101.com says the pattern \b(39||90)\b is invalid, but PHP's preg_match does not return FALSE, as it should if the pattern were invalid.
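One way to check whether PCRE itself accepts the pattern is to compare the return value against FALSE with a strict comparison. The sketch below uses a hypothetical helper, describe_pattern(), which is not part of PHP or of the original post, and adds a deliberately broken pattern (a missing closing parenthesis) for contrast.

```php
<?php
// Hypothetical helper for illustration: reports how preg_match treats a pattern.
function describe_pattern($pattern, $subject)
{
    // "@" suppresses the E_WARNING that a pattern compilation failure raises.
    $result = @preg_match($pattern, $subject);
    if ($result === false) {
        return 'invalid pattern or runtime error';
    }
    return $result === 1 ? 'valid, matched' : 'valid, no match';
}

echo describe_pattern('#\b(39||90)\b#', '42'), "\n"; // valid, matched
echo describe_pattern('#\b(39||90)\b#', ''), "\n";   // valid, no match
echo describe_pattern('#\b(39|90#', '42'), "\n";     // invalid pattern or runtime error
```

The strict comparison matters because int(0) (no match) and FALSE (error) are both falsy under a loose comparison.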

As you can see, 42 produces a match and the empty string produces a non-match. I'd expect it the other way round, as || should stand for the empty string.

What's happening here?

This regex:

\b(39||90)\b

will return a successful match if any one of its alternatives matches. These are:

  1. The complete word 39
  2. The complete word 90
  3. A word boundary anywhere in the input (because of the empty alternative in || )

However, in an empty string there is no word boundary. A word boundary \b is asserted true between a word character \w and a non-word character \W, or between a word character and the start or end of the string.
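The definition can be checked with \b on its own. This is a minimal sketch: word_boundary_offsets() is a hypothetical helper and the sample subjects are chosen for illustration, not taken from the original post.

```php
<?php
// Hypothetical helper for illustration: returns the byte offsets at which
// \b is asserted true in $subject.
function word_boundary_offsets($subject)
{
    // PREG_OFFSET_CAPTURE makes each match an array(text, offset) pair.
    preg_match_all('#\b#', $subject, $matches, PREG_OFFSET_CAPTURE);
    $offsets = array();
    foreach ($matches[0] as $match) {
        $offsets[] = $match[1];
    }
    return $offsets;
}

var_dump(word_boundary_offsets(''));    // empty array: nothing to be a boundary of
var_dump(word_boundary_offsets('#@'));  // empty array: no word characters at all
var_dump(word_boundary_offsets('a'));   // offsets 0 and 1: start and end of the word
var_dump(word_boundary_offsets('a b')); // offsets 0, 1, 2 and 3
```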

E.g. see these results:

// no word character, hence no match
var_dump(preg_match('#\b(39||90)\b#', '#@')); // int(0)

// a word character, hence a match
var_dump(preg_match('#\b(39||90)\b#', 'a')); // int(1)

// no word character, hence no match
var_dump(preg_match('#\b(39||90)\b#', "\t\n")); // int(0)
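To see which alternative actually matched, the matched text and its offset can be captured. The sketch below assumes a hypothetical helper, first_match(), and sample subjects that are not from the original post. It suggests that for '42' the match is the empty string at offset 0, i.e. the empty alternative sitting between two word-boundary assertions.

```php
<?php
// Hypothetical helper for illustration: returns array(matched text, byte offset)
// for the first match of the pattern from the question, or null if none.
function first_match($subject)
{
    $pattern = '#\b(39||90)\b#';
    if (preg_match($pattern, $subject, $m, PREG_OFFSET_CAPTURE) !== 1) {
        return null;
    }
    return $m[0];
}

var_dump(first_match('42'));   // empty string at offset 0
var_dump(first_match('39'));   // "39" at offset 0
var_dump(first_match('a 39')); // empty string at offset 0, found before "39" is reached
var_dump(first_match(''));     // NULL
```

The 'a 39' case shows that the empty alternative can win even when 39 is present, because the regex engine returns the leftmost match it finds.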
