
Regex multiline match excluding lines containing a string

In the following text:

EXCLUDE this entire line
include this line
and this as single match
and EXCLUDE this line

I want to return a single match consisting of the two lines:

include this line
and this as single match

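I want to use EXCLUDE as the string that marks a line that should be left out entirely.

Edit: it would also be fine if I could match only the first block, up to the first line containing "EXCLUDE" (or the end of the document, whichever comes first).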

You can split the string on matches of the regular expression

^.*\bEXCLUDE\b.*\R

with the global and multiline flags set.

For example, in Ruby, if the variable str holds the string

Firstly include this line
EXCLUDE this entire line
include this line
and this as single match
and EXCLUDE this line
Lastly include this line

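then the String#split method can be used to produce an array containing three strings. Because the pattern consumes only the excluded line and its own terminator, each block that precedes an excluded line keeps its trailing line terminator, which String#chomp can remove.

str.split(/^.*\bEXCLUDE\b.*\R/)
  #=> ["Firstly include this line\n",
  #    "include this line\nand this as single match\n",
  #    "Lastly include this line"]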

Many languages have a method or function equivalent to Ruby's split.
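A minimal sketch of how the split approach might be wrapped up for reuse. The helper name included_blocks is hypothetical, not part of the answer. The sketch also assumes two small changes to the pattern's use: `(?:\R|\z)` replaces `\R` so that an excluded last line without a terminator is matched too, and empty pieces are dropped.

```ruby
# Hypothetical helper (not from the answer): split text into blocks of
# consecutive lines that do not contain the whole word EXCLUDE.
def included_blocks(text)
  text
    .split(/^.*\bEXCLUDE\b.*(?:\R|\z)/) # assumption: \z also covers an unterminated last line
    .map(&:chomp)                        # drop the line terminator left at the end of a block
    .reject(&:empty?)                    # drop the empty piece left by a leading EXCLUDE line
end

# The four-line text from the question, without a trailing newline.
question_text = <<~TEXT.chomp
  EXCLUDE this entire line
  include this line
  and this as single match
  and EXCLUDE this line
TEXT

p included_blocks(question_text)
```

In Ruby, ^ always anchors at the start of a line, so no separate multiline flag is needed for this pattern.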


The regular expression can be broken down as follows.

^        # match the beginning of a line
.*       # match zero or more characters other than line
         # terminators, as many as possible
\b       # match word boundary
EXCLUDE  # match literal
\b       # match word boundary
.*       # match zero or more characters other than line
         # terminators, as many as possible
\R       # match line terminator
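To illustrate why the two \b word boundaries matter, here is a small sketch that filters a list of lines with and without them. The sample lines containing NOEXCLUDE and EXCLUDED are made up for illustration.

```ruby
# Sample lines; the last two are invented to show near-misses.
lines = [
  "EXCLUDE this entire line",
  "and EXCLUDE this line",
  "NOEXCLUDE is a different word",
  "EXCLUDED is a different word too"
]

# With word boundaries, only the whole word EXCLUDE counts.
with_boundaries = lines.grep(/\bEXCLUDE\b/)

# Without them, any line containing the letters EXCLUDE is caught.
without_boundaries = lines.grep(/EXCLUDE/)

p with_boundaries
p without_boundaries.size
```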
 

Using PCRE, you can use \K to forget what has been matched so far, after first matching the line that contains EXCLUDE:

^.*\bEXCLUDE\b.*\K(?:\R(?!.*\bEXCLUDE\b).*)+
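Ruby's regex engine also supports \K and \R, so the same pattern can be tried there. The sketch below assumes the four-line text from the question. Note that the match begins with the line terminator that follows the excluded line, so the sketch removes it afterwards.

```ruby
text = "EXCLUDE this entire line\n" \
       "include this line\n" \
       "and this as single match\n" \
       "and EXCLUDE this line"

# Same pattern as above: everything before \K is matched but not reported.
pattern = /^.*\bEXCLUDE\b.*\K(?:\R(?!.*\bEXCLUDE\b).*)+/

raw_match = text[pattern]               # starts with the "\n" after the EXCLUDE line
block = raw_match&.sub(/\A\R/, "")      # strip that leading line terminator

p raw_match
p block
```

This pattern only finds blocks that come after an excluded line, so a block at the very start of the text is not returned.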


If you want to match all lines that do not contain EXCLUDE, grouped into runs of consecutive lines:

(?:(?:^|\R)(?!.*\bEXCLUDE\b).*)+
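As a rough sketch, the consecutive-lines pattern can be applied with String#scan in Ruby to collect every run. The six-line text is the one used in the split example. Runs that follow an excluded line start with a line terminator, which the sketch strips.

```ruby
text = [
  "Firstly include this line",
  "EXCLUDE this entire line",
  "include this line",
  "and this as single match",
  "and EXCLUDE this line",
  "Lastly include this line"
].join("\n")

# Each match is one run of consecutive lines without the word EXCLUDE.
raw_runs = text.scan(/(?:(?:^|\R)(?!.*\bEXCLUDE\b).*)+/)

# A run that follows an excluded line begins with "\n"; remove it.
runs = raw_runs.map { |run| run.sub(/\A\R/, "") }

p runs
```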


Or use the skip-fail approach:

^.*\bEXCLUDE\b.*\R(*SKIP)(*F)|.+(?:\R(?!.*\bEXCLUDE\b).*)*
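The (*SKIP)(*F) verbs are PCRE-specific. In an engine without them, such as Ruby's, a similar effect can be approximated with a different technique: let the first alternative consume excluded lines, capture only the second alternative, and discard matches whose capture is empty. The sketch below is that substitute, not the skip-fail pattern itself, and it assumes `(?:\R|\z)` in place of `\R` so that an excluded final line is consumed as well.

```ruby
text = [
  "Firstly include this line",
  "EXCLUDE this entire line",
  "include this line",
  "and this as single match",
  "and EXCLUDE this line",
  "Lastly include this line"
].join("\n")

# Alternative 1 consumes an excluded line (no capture group).
# Alternative 2 captures a run of lines without the word EXCLUDE.
pattern = /^.*\bEXCLUDE\b.*(?:\R|\z)|(.+(?:\R(?!.*\bEXCLUDE\b).*)*)/

# scan yields [capture] per match; the capture is nil for excluded lines.
blocks = text.scan(pattern).flatten.compact

p blocks
```

Because excluded lines are consumed rather than skipped over, the scan never restarts in the middle of one, which is the job (*SKIP) does in the PCRE version.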


You can also match the lines containing EXCLUDE and use them to split your text into the blocks you are looking for:

<?php

$input = 'First include this line
EXCLUDE this entire line
include this line
and this as single match
and EXCLUDE this line
Lastly include this line';

// ^ matches the beginning of a line.
// .* matches anything (except new lines) zero or multiple times.
// \b matches a word boundary (to avoid matching NOEXCLUDE).
// $ matches the end of a line.
$pattern = '/^.*\bEXCLUDE\b.*$/m';

// Split the text with all lines containing the EXCLUDE word.
$desired_blocks = preg_split($pattern, $input);

// Get rid of the new lines around the matched blocks.
array_walk(
    $desired_blocks,
    function (&$block) {
        // \R matches any Unicode newline sequence.
        // ^ matches the beginning of the string.
        // $ matches the end of the string.
        // | = or
        $block = preg_replace('/^\R+|\R+$/', '', $block);
    }
);

var_export($desired_blocks);

Demo here: https://onlinephp.io/c/4216a

Output:

array (
  0 => 'First include this line',
  1 => 'include this line
and this as single match',
  2 => 'Lastly include this line',
)
