PHP Regex: How to match \r and \n without using [\r\n]?

I have tested \v (vertical white space) for matching \r\n and their combinations, but I found out that \v does not match \r and \n. Below is the code that I am using:

$string = "
Test
";

if (preg_match("#\v+#", $string )) {
  echo "Matched";
} else {
  echo "Not Matched";
}

To be more clear, my question is: is there any other alternative to match \r\n?

PCRE and newlines

PCRE has a superfluity of newline-related escape sequences and alternatives.

Well, a nifty escape sequence that you can use here is \R. By default \R will match Unicode newline sequences, but it can be configured using different alternatives.

To match any Unicode newline sequence that is in the ASCII range:

preg_match('~\R~', $string);

This is equivalent to the following group:

(?>\r\n|\n|\r|\f|\x0b|\x85)

To match any Unicode newline sequence, including newline characters outside the ASCII range and both the line separator (U+2028) and paragraph separator (U+2029), turn on the u (unicode) flag:

preg_match('~\R~u', $string);

The u (unicode) modifier turns on additional functionality of PCRE, and pattern strings are treated as UTF-8.

This is equivalent to the following group:

(?>\r\n|\n|\r|\f|\x0b|\x85|\x{2028}|\x{2029})

It is possible to restrict \R to match CR, LF, or CRLF only:

preg_match('~(*BSR_ANYCRLF)\R~', $string);

This is equivalent to the following group:

(?>\r\n|\n|\r)
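A quick way to compare the three variants is to run them against a few sample strings. This is only a sketch: the helper name countBreaks and the sample inputs are made up for illustration, and it assumes a PHP build whose PCRE library uses the default (Unicode) meaning of \R.

```php
<?php
// Hypothetical helper: count how many times $pattern matches in $subject.
function countBreaks(string $pattern, string $subject): int
{
    return preg_match_all($pattern, $subject);
}

$mixed = "a\r\nb\nc\rd"; // CRLF, LF and CR in one string

// \R consumes CRLF as a single line break.
$default = countBreaks('~\R~', $mixed);

// Form feed counts as a line break for plain \R ...
$formFeedDefault = countBreaks('~\R~', "a\fb");
// ... but not once \R is restricted to CR, LF and CRLF.
$formFeedAnyCrlf = countBreaks('~(*BSR_ANYCRLF)\R~', "a\fb");

// U+2028 (UTF-8 bytes E2 80 A8) is only recognised with the u modifier.
$lineSeparator = "a\xE2\x80\xA8b";
$withoutU = countBreaks('~\R~', $lineSeparator);
$withU    = countBreaks('~\R~u', $lineSeparator);
```

Dumping the variables with var_dump() shows which inputs each variant treats as a line break.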

Additional

Five different conventions for indicating line breaks in strings are supported:

(*CR)        carriage return
(*LF)        linefeed
(*CRLF)      carriage return, followed by linefeed
(*ANYCRLF)   any of the three above
(*ANY)       all Unicode newline sequences
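These verbs go at the very start of the pattern and change what ^, $ and the dot regard as a line ending. A small sketch of the effect, with sample text invented for illustration:

```php
<?php
// Sample text mixing CRLF, LF and CR line endings (made up for this sketch).
$text = "one\r\ntwo\nthree\rfour";

// (*ANYCRLF): in multiline mode, ^ and $ recognise CR, LF and CRLF.
preg_match_all('~(*ANYCRLF)^\w+$~m', $text, $anyCrlf);

// (*LF): only a bare linefeed counts as a line ending, so a word that is
// followed by \r cannot satisfy $.
preg_match_all('~(*LF)^\w+$~m', $text, $lfOnly);
```

Here $anyCrlf[0] should hold all four words, while $lfOnly[0] should hold only "two", the one word that both starts after a linefeed and ends right before one.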

Note: \R does not have special meaning inside of a character class. Like other unrecognized escape sequences, it is treated as the literal character "R" by default. (That describes the original PCRE library; PCRE2, which PHP has bundled since 7.3, rejects \R inside a character class with a compilation error instead.)

This doesn't answer the question for alternatives, because \v works perfectly well.

\v matches any character considered vertical whitespace; this includes the platform's carriage return and line feed characters (newline) plus several other characters, all listed in the table below.

Inside double quotes, PHP itself converts \v to a vertical tab character before the pattern ever reaches PCRE. You only need to change "#\v+#" to either

  • "#\\v+#" (escape the backslash)

or

  • '#\v+#' (use single quotes)

In both cases, you will get a match for any combination of \r and \n.
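The difference between the three quoting styles can be checked directly. This is a minimal sketch using the string from the question; the variable names are made up for illustration.

```php
<?php
$string = "\nTest\n";

// Double quotes: PHP turns \v into a vertical tab (0x0B) before PCRE
// sees the pattern, so only that literal character can match.
$doubleQuoted = preg_match("#\v+#", $string);

// Escaped backslash or single quotes: PCRE receives the two characters \v.
$escaped      = preg_match("#\\v+#", $string);
$singleQuoted = preg_match('#\v+#', $string);

// Confirm that the double-quoted pattern contains a raw vertical tab.
$containsVerticalTab = strpos("#\v+#", "\x0B") !== false;
```

The first call returns 0 because the subject contains no vertical tab; the other two return 1.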

Update:

Just to make the scope of \v clear in comparison to \R, from perlrebackslash:

  • \R
    \R matches a generic newline; that is, anything considered a linebreak sequence by Unicode. This includes all characters matched by \v (vertical whitespace), ...

If there is some strange requirement that prevents you from using a literal [\r\n] in your pattern, you can always use hexadecimal escape sequences instead:

preg_match('#[\xD\xA]+#', $string)

This pattern is equivalent to [\r\n]+.
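As a rough check of that equivalence, both patterns can be run against the same input and their matches compared. The sample string and variable names below are assumptions for illustration.

```php
<?php
$string = "a\r\n\r\nb\nc";

// \xD is carriage return (0x0D) and \xA is linefeed (0x0A). Single quotes
// leave the escapes for PCRE to interpret.
preg_match_all('#[\xD\xA]+#', $string, $hex);
preg_match_all('#[\r\n]+#', $string, $plain);

// Both patterns should find the same runs of line-break characters.
$sameResult = ($hex[0] === $plain[0]);
```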

To match every line of a given string, simply use the ^ and $ anchors and tell your regex engine to operate in multi-line mode. Then ^ and $ will match the start and end of each line, instead of the start and end of the whole string.

http://php.net/manual/en/reference.pcre.pattern.modifiers.php

In PHP, that is the m modifier after the pattern. /^(.*?)$/m will simply match each line, separated by any vertical space inside the given string.
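A short sketch of the multi-line approach, using a made-up three-line string:

```php
<?php
$string = "first\nsecond\nthird";

// With the m modifier, ^ and $ anchor at every line boundary,
// so the capture group collects one line per match.
preg_match_all('/^(.*?)$/m', $string, $matches);

$lines = $matches[1];
```

This assumes LF line endings. With CRLF input and the default newline setting, the captured lines would likely keep a trailing \r, because the dot is free to match it.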

Btw: for line splitting, you could also use explode() and the PHP_EOL constant:

$lines = explode(PHP_EOL, $string);
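One caveat is that PHP_EOL reflects the platform PHP runs on, not the line endings actually present in the data. The following sketch contrasts it with a regex-based split; the sample string is invented for illustration.

```php
<?php
$string = "a\r\nb\nc";

// explode() splits on exactly PHP_EOL, so the result depends on the
// platform (on a system where PHP_EOL is "\n", the first item keeps its "\r").
$byEol = explode(PHP_EOL, $string);

// preg_split() with \R splits on CRLF, LF or CR regardless of the platform.
$byRegex = preg_split('~\R~', $string);
```

For input whose origin is unknown, the preg_split() variant is probably the safer choice.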

The problem is that you need the multiline option, or the dotall option if using dot. It goes after the closing delimiter.

http://www.php.net/manual/en/regexp.reference.internal-options.php

$string = "
Test
";
if(preg_match("#\v+#m", $string ))
echo "Matched";
else
echo "Not Matched";

To match a newline in PHP, use the PHP constant PHP_EOL. This is cross-platform.

if (preg_match('/\v+' . PHP_EOL ."/", $text, $matches ))
   print_R($matches );

This regex also matches newline \n and carriage return \r characters.

(?![ \t\f])\s


To match one or more newline or carriage return characters, you could use the regex below.

(?:(?![ \t\f])\s)+
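The lookahead excludes space, tab and form feed from \s, which leaves the line-break characters. A minimal sketch with made-up input:

```php
<?php
$string = "a \t\r\nb\n\nc";

// The space and tab after "a" are skipped by the negative lookahead;
// only runs of \r and \n are collected.
preg_match_all('~(?:(?![ \t\f])\s)+~', $string, $matches);

$runs = $matches[0];
```

Depending on the PCRE version, \s may also include the vertical tab, in which case this pattern would match it as well.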
