How to loop through an array to find more than one pattern using Perl regex?

I'm trying to find two patterns within an array and put the results into another array.

For example

  $/ = "__Data__";

  __Data__
  #SCSI_test         # put this line into  @arrayNewLines      
  kdkdkdkdkdkdkdkd
  dkdkdkdkdkdkdkdkd
  - ccccccccccccccc  # put this line into @arrayNewLines

Code

    while(<FILEREAD>)
    {
          chomp;
          my @arrayOld = split(\n,@array);

          foreach my $i (0 .. $#arrayOld)
          {
                if($arrayOld[$i] =~ /^-(.*)/g or /\#(.*)/g)
                {
                     my @arrayNewLines = $arrayOld[$i];
                     print "@arrayNewLines\n";
                }
          }
    }

This code prints only ccccccccccccccc, but I would like it to output both ccccccccccccccc and #SCSI_test.

That code does not print just cccccc..., it prints everything. Your problem is this line:

if($arrayOld[$i] =~ /^-(.*)/g or /\#(.*)/g) {

What you are doing here is first checking $arrayOld[$i] and then checking $_, because /\#(.*)/ is Perl shorthand for $_ =~ /\#(.*)/. Since $_ holds the whole record, which contains a hash character #, the second test always matches, and every line is printed.

Your line is equivalent to:

if(   $arrayOld[$i] =~ /^-(.*)/g 
      or 
      $_ =~ /\#(.*)/g) {

The answer there is to join the regexes:

if($arrayOld[$i] =~ /^-|#/) {
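As a minimal sketch of the difference, the snippet below runs both conditions over three sample lines. The sample data and the names @lines, @buggy and @fixed are assumptions made for illustration; $_ is set by hand to stand in for the record that the while loop would have read.

```perl
use strict;
use warnings;

# Hypothetical sample lines, modelled on the data in the question.
my @lines = ('#SCSI_test', 'kdkdkdkdkdkdkdkd', '- ccccccccccccccc');

# Stand-in for the whole record that the while loop leaves in $_.
$_ = "a record that contains a # somewhere";

my (@buggy, @fixed);
for my $i (0 .. $#lines) {
    # Buggy: the second regex has no explicit target, so it tests $_.
    push @buggy, $lines[$i] if $lines[$i] =~ /^-(.*)/ or /\#(.*)/;

    # Fixed: one regex with an alternation, bound to the line itself.
    push @fixed, $lines[$i] if $lines[$i] =~ /^-|#/;
}

print "buggy: $_\n" for @buggy;
print "fixed: $_\n" for @fixed;
```

Note that /^-|#/ reads as "starts with -, or contains # anywhere", because the alternation binds more loosely than the ^ anchor.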

However, your code is far from clean after that... starting from the top:

If you set the input record separator $/ to __Data__ with that input, you will get two records ( Data::Dumper output shown below):

$VAR1 = '__Data__';
$VAR1 = '
#SCSI_test         # put this line into  @arrayNewLines
kdkdkdkdkdkdkdkd
dkdkdkdkdkdkdkdkd
- ccccccccccccccc  # put this line into @arrayNewLines
';

When you chomp the records, you remove __Data__ from the end, so the first record becomes empty. The second record also begins with a newline, so after splitting on newlines you will always have a leading empty field. This is nothing horrible, but something to remember.
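The effect of the record separator can be checked with a small sketch like the one below. It reads from an in-memory filehandle instead of a real file; the $input string and the @records array are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical input, shaped like the data in the question.
my $input = "__Data__\n#SCSI_test\nkdkdkdkdkdkdkdkd\n- ccccccccccccccc\n";

# Open a read handle on the string so no real file is needed.
open my $fh, '<', \$input or die "Cannot open in-memory handle: $!";

my @records;
{
    local $/ = "__Data__";          # records now end at "__Data__", not "\n"
    while (my $record = <$fh>) {
        chomp $record;              # strips a trailing "__Data__", never "\n"
        push @records, $record;
    }
}
close $fh;

printf "record %d: [%s]\n", $_, $records[$_] for 0 .. $#records;
```

The first record is the empty string, and the second one starts with the newline that followed __Data__.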

Your split statement is wrong. First, the first argument should be a regex: /\n/. Second, the second argument should be a scalar, not an array. split(/\n/, @array) puts the array in scalar context, so it returns its size instead of its elements; for a two-element array the call evaluates to split(/\n/, 2).

Also, since you are in a loop reading records from the FILEREAD handle, @array never changes and has nothing to do with the data from the file handle. What you want is: split /\n/, $_.
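A short sketch of the scalar-context behaviour described above; @array and its contents are hypothetical, as are the names @wrong, @right and @fields.

```perl
use strict;
use warnings;

my @array = ("a\nb", "c\nd");   # hypothetical two-element array

# split evaluates its second argument in scalar context,
# so this splits the element count ("2"), not the elements.
my @wrong = split /\n/, @array;

# Splitting one string at a time gives the expected fields.
my @right = map { split(/\n/, $_) } @array;

# A string that starts with a newline yields a leading empty field;
# trailing empty fields are dropped.
my @fields = split /\n/, "\n#SCSI_test\n- ccccccccccccccc\n";

print "wrong:  @wrong\n";
print "right:  @right\n";
print "fields: ", scalar(@fields), "\n";
```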

This loop:

foreach my $i (0 .. $#arrayOld) {

is not a very good loop structure for this problem. Also, there is no need to use an intermediate array. Just use:

for my $line (split /\n/, $_) {

When you do

my @arrayNewLines = $arrayOld[$i];
print "@arrayNewLines\n";

You are assigning a single scalar to a new array and then printing that array, which is redundant. You get the same output by printing the scalar directly.

Your code should look like this:

while(<FILEREAD>) {
    chomp;
    foreach my $line (split /\n/, $_) {
        if($line =~ /^-|#/) {
            print "$line\n";
        }
    }
}
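If the goal is to collect the matching lines in @arrayNewLines rather than print them one by one, one possible variation is to filter the split result with grep. This is a sketch, not the only way to do it; the $record string is a hypothetical stand-in for what $_ would hold inside the while loop.

```perl
use strict;
use warnings;

# Hypothetical record, as it would look after chomp with $/ = "__Data__".
my $record = "\n#SCSI_test\nkdkdkdkdkdkdkdkd\ndkdkdkdkdkdkdkdkd\n- ccccccccccccccc\n";

# Keep every line that starts with "-" or contains "#".
my @arrayNewLines = grep { /^-|#/ } split /\n/, $record;

print "$_\n" for @arrayNewLines;
```

Inside grep each line is aliased to $_, so the bare regex is intentional here, unlike in the original condition.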

It is also recommended that you use lexical file handles, so instead of

open FILEREAD, "somefile" or die $!;       # read with <FILEREAD>

use:

open my $fh, "<", "somefile" or die $!;    # read with <$fh>
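For completeness, here is a sketch of a full read loop with a lexical file handle. So that it runs on its own, it first writes a temporary file with the core File::Temp module; the file contents and the names $path and @matches are assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway input file (removed automatically at exit).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "#SCSI_test\nkdkdkdkdkdkdkdkd\n- ccccccccccccccc\n";
close $tmp_fh or die "Cannot close $path: $!";

# Three-argument open with a lexical handle; the handle is scoped like any "my" variable.
open my $fh, '<', $path or die "Cannot open $path: $!";

my @matches;
while (my $line = <$fh>) {
    chomp $line;
    push @matches, $line if $line =~ /^-|#/;
}
close $fh;

print "$_\n" for @matches;
```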

#! /usr/bin/env perl

use strict;
use warnings;

*ARGV = *DATA;

my @arrayNewLines;

while (<>) {
  chomp;

  if (/^-(.*)/ || /\#(.*)/) {
    push @arrayNewLines, $_;
  }
}

print "$_\n" for @arrayNewLines;

__DATA__
#SCSI_test         # put this line into  @arrayNewLines
kdkdkdkdkdkdkdkd
dkdkdkdkdkdkdkdkd
- ccccccccccccccc  # put this line into @arrayNewLines

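If you have Perl 5.10 or newer, you can also use smart matching. Be aware that smart matching was marked experimental in 5.18 and deprecated in 5.38, so the version above is the safer choice for new code.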

#! /usr/bin/env perl

use strict;
use warnings;

use 5.10.0;  # for smart matching

*ARGV = *DATA;

my @arrayNewLines;

my @patterns = (qr/^-(.*)/, qr/\#(.*)/);

while (<>) {
  chomp;
  push @arrayNewLines, $_ if $_ ~~ @patterns;
}

print "$_\n" for @arrayNewLines;

__DATA__
#SCSI_test         # put this line into  @arrayNewLines
kdkdkdkdkdkdkdkd
dkdkdkdkdkdkdkdkd
- ccccccccccccccc  # put this line into @arrayNewLines

Either way, the output is

#SCSI_test         # put this line into  @arrayNewLines
- ccccccccccccccc  # put this line into @arrayNewLines
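On Perls where the ~~ operator is unavailable or warns, the same list-of-patterns idea can be sketched with a plain grep. This is an alternative to the smart-match version, not a reproduction of it; the @lines data is a hypothetical stand-in for the __DATA__ section.

```perl
use strict;
use warnings;

my @patterns = (qr/^-/, qr/\#/);

# Hypothetical input lines standing in for the __DATA__ section.
my @lines = (
    '#SCSI_test',
    'kdkdkdkdkdkdkdkd',
    'dkdkdkdkdkdkdkdkd',
    '- ccccccccccccccc',
);

my @arrayNewLines;
for my $line (@lines) {
    # In the "if" condition grep returns the number of patterns that matched,
    # so the line is kept when at least one pattern matches.
    push @arrayNewLines, $line if grep { $line =~ $_ } @patterns;
}

print "$_\n" for @arrayNewLines;
```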

