Omitting or excluding Regular Expression matches from a Perl script

Question

Hi, I want to search a file that looks similar to this:

Start Cycle
report 1
report 2
report 3
report 4
End Cycle

... and so on, with more cycles like this one.

I want to search for "Start Cycle" and then pull out report 1 and report 3 from that block. My regex looks something like this:

(Start Cycle .*\n)(.*\n)(.*\n)(.*\n)

The above regex selects Start Cycle and the next three lines, but I want to omit the third line from my result. Is that possible? Or is there an easier way to do this in a Perl script? I am expecting a result like:

Start Cycle
report 1
report 3

Answer 1

The following code prints the odd-numbered report lines between Start Cycle and End Cycle:

while (<$filehandle>) {
    if (/Start Cycle/ .. /End Cycle/) {
        print if /report (\d+)/ and $1 % 2;
    }
}
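
To try the range operator without a file on disk, here is a minimal self-contained sketch. It reads the sample from an in-memory filehandle and, unlike the snippet above, also keeps the Start Cycle line so the output matches what the question asks for. The sub name odd_reports and the $sample data are illustrative assumptions, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the header and the odd-numbered report
# lines from every Start Cycle .. End Cycle block read from $fh.
sub odd_reports {
    my ($fh) = @_;
    my @kept;
    while (my $line = <$fh>) {
        # In scalar context the range operator acts as a flip-flop: it is
        # true from the line matching the left pattern through the line
        # matching the right pattern.
        if ($line =~ /^Start Cycle/ .. $line =~ /^End Cycle/) {
            push @kept, $line
                if $line =~ /^Start Cycle/
                or ($line =~ /report (\d+)/ and $1 % 2);
        }
    }
    return @kept;
}

my $sample = "Start Cycle\nreport 1\nreport 2\nreport 3\nreport 4\nEnd Cycle\n";
open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
print odd_reports($fh);
close $fh;
```

Lines outside a Start Cycle .. End Cycle block are ignored even if they happen to contain the word "report", because the flip-flop is false there.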

Answer 2

You can find the text between the start and end markers, then split it into lines. Here is an example:

my $text = <<TEXT;
Start Cycle
report 1
report 2
report 3
report 4
End Cycle
TEXT

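## find text between all start/end pairs
while ($text =~ m/^Start Cycle$(.*?)^End Cycle$/msg) {
    my $reports_text = $1;
    ## remove leading whitespace (the newline after "Start Cycle")
    $reports_text =~ s/^\s+//;
    ## split text into lines
    my @report_parts = split(/\r?\n/, $reports_text);
    ## print the header plus the first and third reports
    print "Start Cycle\n", map { "$_\n" } @report_parts[0, 2];
}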

Answer 3

Perhaps a crazy way to do it: change Perl's idea of an input record by setting the input record separator, $/, so that each read returns a whole cycle.

$/ = "End Cycle\n";
print( (/(.+\n)/g)[0,1,3] ) while <$file_handle>;

Answer 4

The regex populates $1, $2, $3 and $4 with the contents of each pair of parentheses.

So if you just look at the contents of $1, $2 and $4 you have what you want.

Alternatively, you can just leave the parentheses off the third line, so that it is matched but not captured.

Your regex should then look something like:

/Start Cycle\n(.+)\n.+\n(.+)\n.+\nEnd Cycle/g

The /g modifier lets you evaluate the regex repeatedly (for example in a while loop); each evaluation picks up where the previous match ended and returns the next match.
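
As a rough illustration of that loop, the sketch below applies the regex to a string holding two cycles and collects the first and third report of each. The helper name first_and_third and the second cycle's report numbers are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical sample containing two cycles.
my $text = join '',
    "Start Cycle\nreport 1\nreport 2\nreport 3\nreport 4\nEnd Cycle\n",
    "Start Cycle\nreport 5\nreport 6\nreport 7\nreport 8\nEnd Cycle\n";

# Hypothetical helper: collects the first and third report of each cycle.
sub first_and_third {
    my ($input) = @_;
    my @out;
    # In scalar context, /g resumes where the previous match ended, so
    # each pass through the loop handles the next cycle.
    while ($input =~ /Start Cycle\n(.+)\n.+\n(.+)\n.+\nEnd Cycle/g) {
        # $1 is the first report line, $2 the third; the second and
        # fourth are matched by .+ but not captured.
        push @out, "Start Cycle\n$1\n$2\n";
    }
    return @out;
}

print first_and_third($text);
```

Because . does not match a newline by default, each .+ here covers exactly one line.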

Answer 5

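If you want to keep all of the surrounding code unchanged but stop capturing the third item, just remove the parentheses that cause that line to be captured: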

(Start Cycle .*\n)(.*\n).*\n(.*\n)
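
A small sketch of this on the sample data follows. One assumption to flag: the pattern in the question has a space between "Start Cycle" and .*, which would not match a bare "Start Cycle" line like the one in the sample, so the space is dropped here. The helper name capture_three is invented for the example.

```perl
use strict;
use warnings;

my $text = "Start Cycle\nreport 1\nreport 2\nreport 3\nreport 4\nEnd Cycle\n";

# Hypothetical helper: returns the three captured lines in list context,
# or an empty list when the pattern does not match.
sub capture_three {
    my ($input) = @_;
    # Assumption: "Start Cycle.*" (no space before .*) so that a bare
    # "Start Cycle" line also matches.
    # The third line is matched by the unparenthesized .*\n, so it is
    # consumed but never captured.
    return $input =~ /(Start Cycle.*\n)(.*\n).*\n(.*\n)/;
}

print capture_three($text);
```

With the parentheses removed, the captures are renumbered: the line that used to be $4 is now $3.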

Answer 6

I took the OP's question as a Perl exercise and came up with the following code. It was just written for learning purposes. Kindly correct me if anything looks suspicious.

while (<>) {
    if (/Start Cycle/) {
        push @block, $_;
        push @block, scalar <> for 1 .. 3;
        print @block[0, 1, 3];
        @block = ();
    }
}

Another version (edited, with thanks to @FM):

local $/;    # undefine the record separator to slurp the whole input
$_ = <>;
@block = (/(Start Cycle\n)(.+\n).+\n(.+\n)/g);
print @block;

Answer 7

Update: I did not originally notice that this was just @FM's answer in a slightly more robust and longer form.

#!/usr/bin/perl

use strict; use warnings;

{
    local $/ = "End Cycle\n";
    while ( my $block = <DATA> ) {
        last unless my ($heading) = $block =~ /^(Start Cycle\n)/g;
        print $heading, ($block =~ /([^\n]+\n)/g)[1, 3];
    }
}

__DATA__
Start Cycle
report 1
report 2
report 3
report 4
End Cycle

Output:

Start Cycle
report 1
report 3

Answer 8

while (<>) {
    if (/Start Cycle/) {
        print $_;            # the "Start Cycle" line
        $_ = <>;
        print $_;            # report 1
        $_ = <>; $_ = <>;    # skip report 2, read report 3
        print $_;            # report 3
    }
}

8 answers

solution1
5 ACCEPTED 2009-11-25 22:36:40

solution2
2 2009-11-25 22:38:10

solution3
2 2009-11-25 22:55:52

solution4
1 2009-11-25 22:35:48

solution5
1 2009-11-25 22:50:58

solution6
1 2009-11-26 07:47:10

solution7
0 2009-11-25 23:44:50

solution8
0 2009-11-26 00:36:28
