
PHP regex with safe delimiters

I thought that PHP's Perl-compatible regular expressions (the preg library) support curly brackets as delimiters. This should be fine:

{ello {world}i // should match on Hello {World

The main point of curly-bracket delimiters is that only the leftmost and rightmost ones are taken as delimiters, so the inner ones need no escaping. As far as I can tell, PHP requires the escaping:

{ello \{world}i // this actually matches on Hello {World

Is this the expected behavior or a bug in PHP's preg implementation?

Expected behavior as far as I know; otherwise, how would the compiler allow repetition quantifiers? For example:

[a-z]{1,5}

From http://lv.php.net/manual/en/regexp.reference.delimiters.php :

If the delimiter needs to be matched inside the pattern it must be escaped using a backslash. If the delimiter appears often inside the pattern, it is a good idea to choose another delimiter in order to increase readability.

So this is expected behavior, not a bug.
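The manual's advice can be tried out with a short script. This is a minimal sketch, assuming PHP 7 or later with the bundled PCRE extension; the variable names and the choice of `~` as the alternative delimiter are illustrative, not taken from the question.

```php
<?php
// Subject string from the question's example.
$subject = 'Hello {World';

// Unescaped inner brace: PHP cannot find the closing delimiter, so
// preg_match() raises a warning (suppressed here with @) and returns false.
$unescaped = @preg_match('{ello {world}i', $subject);

// Escaped inner brace: the pattern compiles and matches.
$escaped = preg_match('{ello \{world}i', $subject);

// A delimiter that does not occur in the pattern avoids the escaping.
$tilde = preg_match('~ello {world~i', $subject);

// preg_quote() escapes regex metacharacters plus the delimiter given as
// its second argument, which helps when the literal text is not fixed.
$quoted = preg_quote('ello {world', '~');
$viaQuote = preg_match('~' . $quoted . '~i', $subject);

var_dump($unescaped, $escaped, $tilde, $quoted, $viaQuote);
```

Only the first call should fail; the other three calls are different ways of getting the same literal brace into the pattern.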

In Perl, when you use any of the four paired ASCII bracket types as the pattern delimiter, you only need to escape unpaired brackets within the pattern. This is indeed the entire purpose of using brackets. It is documented in the perlop manpage under “Quote and Quote-like Operators”, which reads in part:

   Non-bracketing delimiters use the same character fore and aft, 
   but the four sorts of brackets (round, angle, square, curly) 
   will all nest, which means that

      q{foo{bar}baz}

   is the same as

      'foo{bar}baz'

   Note, however, that this does not always work for quoting Perl code:

      $s = q{ if($a eq "}") ... }; # WRONG

That's why you often see people use m{…} or qr{…} in Perl code, especially for multiline patterns used with /x, also known as (?x). For example:

return qr{                  
    (?=                     # pure lookahead for conjunctive matching
        \A                  # always from start
        .*?                 # going only as far as we need to find the pattern
        (?:
            ${case_flag}
            ${left_boundary}
            ${positive_pattern}
            ${right_boundary}
        )
    )
}sxm;

Notice how those nested braces are no problem.

I found that no escaping is required in these cases:

'ello {world'i
(ello {world)i

So my theory is that the problem lies only with the '{' delimiters. Also, the following two produce the same error:

{ello {world}i
(ello (world)i

Using opening/closing brackets as delimiters may require escaping those same brackets inside the expression.
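These observations are consistent with PHP counting nested occurrences of a bracket-style delimiter while it looks for the closing one, so balanced inner brackets pass and an unpaired one does not. A small sketch to check this, assuming PHP 7 or later; the subject strings and variable names are made up for illustration.

```php
<?php
// Balanced inner braces: the quantifier {1,5} nests inside the delimiters.
$balancedBraces = preg_match('{^[a-z]{1,5}$}', 'hello');

// Balanced inner parentheses: the group (world) nests inside the delimiters.
$balancedParens = preg_match('(ello (world))i', 'Hello World');

// Unpaired inner parenthesis: no closing delimiter is found, so
// preg_match() raises a warning (suppressed with @) and returns false.
$unpairedParen = @preg_match('(ello (world)i', 'Hello (World');

// Escaping the unpaired parenthesis makes it a literal character.
$escapedParen = preg_match('(ello \(world)i', 'Hello (World');

var_dump($balancedBraces, $balancedParens, $unpairedParen, $escapedParen);
```

If the sketch behaves as assumed, only the third call fails, which suggests the rule in PHP is close to Perl's: escape only the brackets that are not paired.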
