
Omitting or excluding Regular Expression matches from a Perl script

Hi, I want to search a file that looks similar to this:

Start Cycle
report 1
report 2
report 3
report 4
End Cycle

.... goes on and on..

I want to search for "Start Cycle" and then pull out report 1 and report 3 from that block. My regex looks something like this:

(Start Cycle .*\n)(.*\n)(.*\n)(.*\n)

The above regex selects Start Cycle and the next three lines, but I want to omit the third line from my result. Is that possible? Or is there an easier way to do this in a Perl script? I am expecting a result like:

Start Cycle
report 1
report 3

The following code prints the odd-numbered report lines between Start Cycle and End Cycle (it does not print the Start Cycle line itself):

while (<$filehandle>) {
    if (/Start Cycle/ .. /End Cycle/) {
        print if /report (\d+)/ and $1 % 2;
    }
}

You can find the text between the start and end markers and then split it into lines. Here is an example:

my $text = <<TEXT;
Start Cycle
report 1
report 2
report 3
report 4
End Cycle
TEXT

## find text between all start/end pairs
while ($text =~ m/^Start Cycle$(.*?)^End Cycle$/msg) {
    my $reports_text = $1;
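    ## remove leading whitespace (the newline after "Start Cycle")
    $reports_text =~ s/^\s+//;
    ## split text by newlines
    my @report_parts = split(/\r?\n/, $reports_text);
    ## print the header plus the first and third report lines
    print "Start Cycle\n", map { "$_\n" } @report_parts[0, 2];
}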

Perhaps a crazy way to do it: alter Perl's understanding of an input record.

$/ = "End Cycle\n";
print( (/(.+\n)/g)[0,1,3] ) while <$file_handle>;

The regex in the question populates $1, $2, $3 and $4 with the text matched by each pair of parentheses.

So if you just look at the contents of $1, $2 and $4 you have what you want.

Alternatively you can just leave the parentheses off the part that matches the line you do not want.

Your regex should look something like

/Start Cycle\n(.+)\n.+\n(.+)\n.+\nEnd Cycle/g

The /g flag lets you evaluate the regex repeatedly and get the next match each time.
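
As a rough sketch of how that regex could be driven in a loop: the sample text and the @picked array below are made up for illustration, and the input is assumed to follow the exact layout shown in the question (four report lines per cycle).

```perl
use strict;
use warnings;

# Hypothetical sample input, assumed to match the layout in the question.
my $text = <<'TEXT';
Start Cycle
report 1
report 2
report 3
report 4
End Cycle
Start Cycle
report 5
report 6
report 7
report 8
End Cycle
TEXT

my @picked;
# In scalar context, /g resumes right after the previous match on each pass.
while ($text =~ /Start Cycle\n(.+)\n.+\n(.+)\n.+\nEnd Cycle/g) {
    # $1 is the first report line, $2 the third; the others are not captured.
    push @picked, "Start Cycle\n$1\n$2\n";
}
print @picked;
```

Each pass through the loop handles one Start Cycle/End Cycle block, so nothing has to be said about how many blocks the file contains.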

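If you want to keep all of the surrounding code unchanged but stop capturing the third thing, just remove the parentheses that cause that line to be captured: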

(Start Cycle .*\n)(.*\n).*\n(.*\n)
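
A small sketch of that pattern in use. Note that it requires a space after "Start Cycle", so the sample header below ("Start Cycle 1") is an assumption about the real file; the $text and $result names are made up for illustration.

```perl
use strict;
use warnings;

# Assumed input: the header has text after "Start Cycle " (the pattern needs that space).
my $text = "Start Cycle 1\nreport 1\nreport 2\nreport 3\nreport 4\nEnd Cycle\n";

my $result = '';
if ($text =~ /(Start Cycle .*\n)(.*\n).*\n(.*\n)/) {
    # With the third pair of parentheses removed, the groups are renumbered:
    # what used to be $4 is now $3.
    $result = "$1$2$3";
}
print $result;
```

The only change needed in the surrounding code is that renumbering: anything that previously read $4 should now read $3.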

I took the OP's question as a Perl exercise and came up with the following code. It was just written for learning purposes. Kindly correct me if anything looks suspicious.

while (<>) {
    if (/Start Cycle/) {
        push @block, $_;
        push @block, scalar <> for 1..3;
        print @block[0,1,3];
        @block = ();
    }
}

Another version (edited; thanks, @FM):

local $/;
$_ = <>;
@block = (/(Start Cycle\n)(.+\n).+\n(.+\n)/g);
print @block;

Update: I did not originally notice that this was just @FM's answer in a slightly more robust and longer form.

#!/usr/bin/perl

use strict; use warnings;

{
    local $/ = "End Cycle\n";
    while ( my $block = <DATA> ) {
        last unless my ($heading) = $block =~ /^(Start Cycle\n)/g;
        print $heading, ($block =~ /([^\n]+\n)/g)[1, 3];
    }
}

__DATA__
Start Cycle
report 1
report 2
report 3
report 4
End Cycle

Output:

Start Cycle
report 1
report 3

while (<>) {
    if (/Start Cycle/) {
        print $_;
        $_ = <>;
        print $_;
        $_ = <>; $_ = <>;
        print $_;
    }
}
