Regex multiline match excluding lines containing a string

Given the following text:

EXCLUDE this entire line
include this line
and this as single match
and EXCLUDE this line

I would like to return a single match made up of these two lines:

include this line
and this as single match

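I want to use EXCLUDE as the string that marks an entire line as one that should not be included.

Edit: it would also be fine to match only the first block, up to the next line containing "EXCLUDE" (or the end of the document, whichever comes first).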

You can split the string on matches of the regular expression

^.*\bEXCLUDE\b.*\R

with the global and multiline flags set.

For example, in Ruby, if the variable str holds the string

Firstly include this line
EXCLUDE this entire line
include this line
and this as single match
and EXCLUDE this line
Lastly include this line

then the String#split method can be used to produce an array containing three strings.

# split keeps the newline that precedes each removed line; chomp drops it.
str.split(/^.*\bEXCLUDE\b.*\R/).map(&:chomp)
  #=> ["Firstly include this line",
  #    "include this line\nand this as single match",
  #    "Lastly include this line"]

Many languages have a method or function comparable to Ruby's split.

The regular expression can be broken down as follows.

^        # match the beginning of a line
.*       # match zero or more characters other than line
         # terminators, as many as possible
\b       # match word boundary
EXCLUDE  # match literal
\b       # match word boundary
.*       # match zero or more characters other than line
         # terminators, as many as possible
\R       # match line terminator
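
The same breakdown can be kept next to the code by writing the pattern with Ruby's /x (extended) flag, which ignores whitespace and allows comments inside the regex. This is only a sketch: the names exclude_line, sample and removed, and the sample text itself, are made up for illustration.

```ruby
# The pattern from above in extended mode, one token per line.
exclude_line = /
  ^          # beginning of a line
  .*         # any characters except line terminators, as many as possible
  \b         # word boundary
  EXCLUDE    # the literal marker
  \b         # word boundary
  .*         # rest of the line
  \R         # the line terminator
/x

# Hypothetical sample text. NOEXCLUDE has no word boundary before
# EXCLUDE, so that line is not treated as an excluded line.
sample = "keep me\ndrop EXCLUDE me\nNOEXCLUDE stays\n"

removed = sample.scan(exclude_line)     # lines the pattern matches
kept    = sample.gsub(exclude_line, "") # text with those lines deleted

p removed
p kept
```

Because the pattern ends in \R, an EXCLUDE line that is the very last line of the text and has no trailing line terminator will not match.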
 

With PCRE you can use \K to forget what has been matched so far, matching the line that contains EXCLUDE first:

^.*\bEXCLUDE\b.*\K(?:\R(?!.*\bEXCLUDE\b).*)+
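
Ruby's regex engine also understands \K, so the pattern can be tried there unchanged. The sketch below uses the four-line text from the question; the variable names are my own. Note that the match begins with the line terminator that follows the EXCLUDE line, so that is stripped afterwards.

```ruby
text = "EXCLUDE this entire line\n" \
       "include this line\n" \
       "and this as single match\n" \
       "and EXCLUDE this line"

# \K drops the EXCLUDE line from the reported match; what is left is the
# run of following lines that do not contain EXCLUDE.
pattern = /^.*\bEXCLUDE\b.*\K(?:\R(?!.*\bEXCLUDE\b).*)+/

# String#[] returns only the first match (nil if there is none).
# Remove the leading line terminator that the match starts with.
first_block = text[pattern]&.sub(/\A\R/, "")

puts first_block
```

This only finds blocks that come after an EXCLUDE line, so text before the first EXCLUDE line is never returned.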

If you want to match every run of consecutive lines that do not contain EXCLUDE:

(?:(?:^|\R)(?!.*\bEXCLUDE\b).*)+
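
As a rough check, this pattern can be run with String#scan in Ruby, which supports \R and lookaheads. The sketch uses the six-line sample from the first answer; variable names are illustrative. Every block except one that starts at the beginning of the text begins with a line terminator, so it is removed after matching.

```ruby
text = "Firstly include this line\n" \
       "EXCLUDE this entire line\n" \
       "include this line\n" \
       "and this as single match\n" \
       "and EXCLUDE this line\n" \
       "Lastly include this line"

# One or more consecutive lines, each checked by the negative lookahead
# to make sure it does not contain the word EXCLUDE.
pattern = /(?:(?:^|\R)(?!.*\bEXCLUDE\b).*)+/

# scan returns every match; drop the leading line terminator of each block.
blocks = text.scan(pattern).map { |block| block.sub(/\A\R/, "") }

p blocks
```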

Or use the skip-fail approach:

^.*\bEXCLUDE\b.*\R(*SKIP)(*F)|.+(?:\R(?!.*\bEXCLUDE\b).*)*
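
The (*SKIP)(*F) verbs are PCRE features and, as far as I know, Ruby's engine does not support them. A common substitute, sketched here, is to let the first alternative consume the unwanted lines and put the wanted part in a capture group, then keep only the non-nil captures. This is an emulation of the idea, not the pattern above; the variable names are my own.

```ruby
text = "Firstly include this line\n" \
       "EXCLUDE this entire line\n" \
       "include this line\n" \
       "and this as single match\n" \
       "and EXCLUDE this line\n" \
       "Lastly include this line"

# Alternative 1 consumes a whole EXCLUDE line and captures nothing.
# Alternative 2 captures a run of lines that do not contain EXCLUDE.
pattern = /^.*\bEXCLUDE\b.*\R|(.+(?:\R(?!.*\bEXCLUDE\b).*)*)/

# With one capture group, scan yields one-element arrays; the entries for
# EXCLUDE lines are [nil], so flatten and discard the nils.
blocks = text.scan(pattern).flatten.compact

p blocks
```

Unlike the previous two patterns, the captured blocks here need no trimming, because each one starts with .+ rather than with a line terminator.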

You can also match the lines containing EXCLUDE and use them to split your text into the blocks you are looking for:

<?php

$input = 'First include this line
EXCLUDE this entire line
include this line
and this as single match
and EXCLUDE this line
Lastly include this line';

// ^ matches the beginning of a line.
// .* matches anything (except new lines) zero or multiple times.
// \b matches a word boundary (to avoid matching NOEXCLUDE).
// $ matches the end of a line.
$pattern = '/^.*\bEXCLUDE\b.*$/m';

// Split the text with all lines containing the EXCLUDE word.
$desired_blocks = preg_split($pattern, $input);

// Get rid of the new lines around the matched blocks.
array_walk(
    $desired_blocks,
    function (&$block) {
        // \R matches any Unicode newline sequence.
        // ^ matches the beginning of the string.
        // $ matches the end of the string.
        // | = or
        $block = preg_replace('/^\R+|\R+$/', '', $block);
    }
);

var_export($desired_blocks);

Demo here: https://onlinephp.io/c/4216a

Output:

array (
  0 => 'First include this line',
  1 => 'include this line
and this as single match',
  2 => 'Lastly include this line',
)
