Is there a way to write a REGEX pattern over multiple lines?

I often end up with ultra-complex and long regexps. PCRE @ PHP.

For a long time, I've been looking for a way to do something like:

preg_match('#blablabla...
blablablablablabla...
blablablablablabla...
blablablablablabla...
blablablablablabla...
blablablablablabla...
blablablablablabla...
blablablablablabla...
blablablablablabla...
blablablablablabla...
blablablablablabla...
blablabla#uis');

Instead of:

preg_match('#blablabla...blablablablablabla...blablablablablabla...blablablablablabla...blablablablablabla...blablablablablabla...blablablablablabla...blablablablablabla...blablablablablabla...blablablablablabla...blablablablablabla...blablabla#uis');

If I insert actual line breaks, they become part of the regular expression. Perhaps not as an actual linebreak, but as whitespace. Unless I'm completely mistaken.

Is there some character I can use at the end of each row to say: "this is supposed to all be one line"?

You can use a HEREDOC (which supports variable interpolation) or a NOWDOC (which does not) together with the x flag (modifier). See what the docs say about this modifier:

x (PCRE_EXTENDED)
If this modifier is set, whitespace data characters in the pattern are totally ignored except when escaped or inside a character class, and characters between an unescaped # outside a character class and the next newline character, inclusive, are also ignored. This is equivalent to Perl's /x modifier, and makes it possible to include commentary inside complicated patterns. Note, however, that this applies only to data characters. Whitespace characters may never appear within special character sequences in a pattern, for example within the sequence (?( which introduces a conditional subpattern.
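To make the quoted rule concrete, here is a minimal sketch of how the x modifier treats whitespace and comments. The variable names and test strings are made up for illustration; only preg_match() and the x flag come from the answer.

```php
<?php
// Unescaped whitespace is dropped under x, so this behaves like /abc/.
$ignored = preg_match('/ a b c /x', 'abc');

// For the same reason it does not match text that really contains spaces.
$noMatch = preg_match('/ a b c /x', 'a b c');

// A backslash-escaped space stays significant.
$escaped = preg_match('/a \  b/x', 'a b');

// Everything from an unescaped # to the next newline is a comment.
$withComment = preg_match("/abc # trailing comment\n/x", 'abc');

var_dump($ignored, $noMatch, $escaped, $withComment);
```

With a stock PHP build I would expect the four results to be 1, 0, 1 and 1 respectively.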

// HEREDOC
$pattern_with_interpolation = <<<EOD
/
blablabla...  # comment here
blablabla     # comment here
/uisx
EOD;

// NOWDOC
$pattern_without_interpolation = <<<'EOD'
/blablabla... # comment here
blablabla     # comment here
/uisx
EOD;
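The practical difference between the two forms can be sketched as follows. This is only an illustration under assumptions: the $unit value, the variable names and the test subjects are hypothetical. In the HEREDOC, $unit is interpolated, while \d and \s pass through untouched because they are not PHP escape sequences. In the NOWDOC, a bare $ keeps its regex meaning.

```php
<?php
// Hypothetical literal to embed; preg_quote() escapes regex metacharacters
// and the delimiter passed as the second argument.
$unit = preg_quote('lb.', '/');

// HEREDOC: $unit is replaced by its value before PCRE sees the pattern.
$heredocPattern = <<<EOD
/
\d+      # one or more digits
\s+      # whitespace
$unit    # interpolated literal
/x
EOD;

// NOWDOC: nothing is interpolated, so $ is the end-of-subject anchor.
$nowdocPattern = <<<'EOD'
/
\d+      # one or more digits
$        # end of subject
/x
EOD;

$heredocHit = preg_match($heredocPattern, '12 lb.', $m1);
$nowdocHit  = preg_match($nowdocPattern, 'item 42', $m2);

echo $m1[0], "\n", $m2[0], "\n";
```

Two caveats worth keeping in mind: comments inside the pattern must not contain the delimiter (here /), because PHP looks for the closing delimiter before PCRE strips comments, and, as far as I know, preg_quote() does not escape whitespace, so an interpolated value containing spaces would lose them under the x flag.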

Mind that you need to escape every literal # and every literal whitespace character in the pattern: with the x flag, an unescaped # starts a comment that runs to the end of the line, and unescaped whitespace is treated as formatting only, so neither matches the corresponding character.

Example

$pattern_without_interpolation = <<<'EOD'
/
\d+      # one or more digits
\        # a single space
\p{L}+   # one or more letters
\#       # a literal hash symbol
/ux
EOD;
if (preg_match($pattern_without_interpolation, '1 pound#', $m)) {
    echo $m[0];
}
// => 1 pound#

See the PHP demo.
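As a variation on the example above, the docs quote also allows character classes instead of backslash escapes, since whitespace and # are literal inside [...]. The sketch below reuses the same subject and contrasts it with a version where nothing is escaped; the variable names are hypothetical.

```php
<?php
$subject = '1 pound#';

// Same pattern as the example, with the escapes written as character classes.
$withClasses = <<<'EOD'
/
\d+      # one or more digits
[ ]      # a single space
\p{L}+   # one or more letters
[#]      # a literal hash symbol
/ux
EOD;

// Nothing escaped: the space is dropped and the trailing # opens a comment,
// so this is effectively \d+\p{L}+ and cannot match across the space.
$unescaped = <<<'EOD'
/
\d+ \p{L}+ #
/ux
EOD;

$classResult     = preg_match($withClasses, $subject, $m);
$unescapedResult = preg_match($unescaped, $subject);

var_dump($classResult, $unescapedResult);
```

I would expect the first call to return 1 with $m[0] equal to '1 pound#', and the second to return 0.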
