
Perl: remove multiple lines that match a regex

I have a file that looks like this:

*
TEST CASE1,
$ some text unque633
PLACEMENT 123
*
TEST CASE2,
$ some text unque759
PLACEMENT 321
*
TEST CASE3,
$ some text unque966
PLACEMENT 856
*

I want to remove multiple lines that match a regex. For example, I need to remove everything from TEST CASE2 up to the next line beginning with * . How can this be done within a Perl script? Also, how would I edit the text of the TEST CASE2 block if I only know unque759 ? Much appreciated.

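The goal can be achieved with the following approach: read all the data into a variable, replace the block that starts at 'TEST CASE2' and runs up to (but does not include) the next '*' with nothing, then print the result. Because the closing '*' is not part of the match, the output contains two consecutive '*' lines where the block used to be.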

use strict;
use warnings;
use feature 'say';

my $data = do { local $/; <DATA> };     # read all data at once

$data =~ s/TEST CASE2[^*]*//;           # delete from TEST CASE2 up to, but not including, the next '*'

say $data;

__DATA__
*
TEST CASE1,
$ some text
PLACEMENT 123
*
TEST CASE2,
$ some text
PLACEMENT 321
*
TEST CASE3,
$ some text
PLACEMENT 856
*

Output

*
TEST CASE1,
$ some text
PLACEMENT 123
*
*
TEST CASE3,
$ some text
PLACEMENT 856
*

This removes only the block that begins with TEST CASE2, contains unque759, and ends with a * line:

cat file.txt 
*
TEST CASE1,
$ some text unque633
PLACEMENT 123
*
TEST CASE2,
$ some text unque759
PLACEMENT 321
*
TEST CASE2,
$ some text unque999
PLACEMENT 321
*
TEST CASE3,
$ some text unque966
PLACEMENT 856
*

perl -0777 -pe 's/TEST CASE2,[^*]+?\bunque759\b[^*]+?\*(?:\R|\z)//' file.txt
*
TEST CASE1,
$ some text unque633
PLACEMENT 123
*
TEST CASE2,
$ some text unque999
PLACEMENT 321
*
TEST CASE3,
$ some text unque966
PLACEMENT 856
*

Explanation:

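-0777            # “slurp” mode: read the whole file as a single string
-p               # run the code on the input, then print the result
-e               # take the code from the command line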
s/               # substitute
    TEST CASE2,     # literally
    [^*]+?          # 1 or more non asterisk, not greedy
    \b              # word boundary
    unque759        # literally
    \b              # word boundary
    [^*]+?          # 1 or more non asterisk, not greedy
    \*              # an asterisk
    (?:\R|\z)       # non-capturing group: a line break OR end of string
//               # replace with nothing
