
Perl File Reading and RegEx Matching

I'm writing a small Perl script, but I have a problem reading a file and then iterating over the regex matches.

In particular, the file spans multiple lines, and from each line I need to extract some values. Here is an example to make this clearer.

This is a sample of the file:

            1A    OCC OCC  4B  5B  6B  7B  8B    9A
      OCC OCC    12B 13B 14B OCC 16B 17B 18B   OCC OCC

I need to match, separately for the first, second, ..., nth line: 1A 4B 5B 6B 7B ...

That is, everything except OCC.

I wrote this code:

my $path="file.txt";

open (my $fh, "<", $path);

 while(my $line = <$fh>)
 {
    for ($line =~/(\d{1,2}[A|B|C])/){   
      print " $1";  
 }
}

The result I get is only the first match on each line: 1A 12B

How can I extend this to read every line and match all of its content correctly?

The print is only there for debugging.

To match all occurrences of a regex, you need to use the /g modifier.

Also, as the argument to for is evaluated in list context, it would return all the matches at once, so using $1 would return the same value (the last one) for each match; but you can use the loop variable instead:

for ($line =~ /(\d{1,2}[ABC])/g) {
    print " $_";
}
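To see the difference between $1 and the loop variable, here is a small sketch. The sample line is hypothetical (modelled on the input in the question), and the two arrays exist only to record what each variable holds in every iteration.

```perl
use strict;
use warnings;

# Hypothetical sample line, modelled on the input in the question.
my $line = "      1A    OCC OCC  4B  5B  6B";

# In list context, /g returns every capture at once, so all matching
# has already finished before the loop body runs for the first time.
my (@via_capture, @via_topic);
for ($line =~ /(\d{1,2}[ABC])/g) {
    push @via_capture, $1;   # always the capture of the last match
    push @via_topic,   $_;   # the current element of the match list
}

print "via \$1: @via_capture\n";   # via $1: 6B 6B 6B 6B
print "via \$_: @via_topic\n";     # via $_: 1A 4B 5B 6B
```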

However, it's more common to loop over the matches with while, because in scalar context a /g match returns one match per iteration, so no long list of matches has to be built first. Here you do need $1, as the loop condition is evaluated in scalar context:

while ($line =~ /(\d{1,2}[ABC])/g) {
    print " $1";
}
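As a rough illustration of how the while form walks through the string, the sketch below records each match together with pos($line), the offset at which the next search will resume. The sample line and the @found array are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical sample line, modelled on the input in the question.
my $line = "  OCC OCC    12B 13B 14B OCC 16B";

my @found;
while ($line =~ /(\d{1,2}[ABC])/g) {
    # pos($line) is the offset just past the current match; the next
    # iteration resumes searching from there.
    push @found, [ $1, pos($line) ];
}

printf "%-3s ends at offset %d\n", @$_ for @found;
```

When the match finally fails, pos($line) is reset, so running the same loop again on $line starts from the beginning.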

Notes: Your input doesn't contain | , so I removed it from the character class.
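Inside a character class, | is not alternation but a literal pipe character, so [A|B|C] also accepts "|". A minimal sketch with a made-up input string that contains a pipe:

```perl
use strict;
use warnings;

# Hypothetical input containing a literal pipe after a digit.
my $text = "1A 2| 3C";

my @with_pipe    = $text =~ /\d{1,2}[A|B|C]/g;   # | is just another allowed character
my @without_pipe = $text =~ /\d{1,2}[ABC]/g;     # only A, B or C allowed

print "[A|B|C]: @with_pipe\n";      # [A|B|C]: 1A 2| 3C
print "[ABC]:   @without_pipe\n";   # [ABC]:   1A 3C
```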

Without /g, the match as you wrote it finds the first occurrence and stops. In list context it returns just that one captured value, so the for loop iterates over a single element (the result of $line =~ ...).

You can instead use the /g modifier, which makes the regex keep going and find all matches. If you assign the result to an array, the match operator is in list context and returns all of the matches:

my @matches = $line =~ /\d{1,2}[A-C]/g;

Here you don't need the capturing parentheses, since you take the whole match. When in doubt, add them. If you simply need any digits followed by any letters, you can use /\d+\w+/g instead (bear in mind that \w also matches digits and underscores).
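The following sketch compares what a list-context match returns with and without /g, and with and without capture groups. The sample line and variable names are made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical sample line, modelled on the input in the question.
my $line = "1A OCC 4B 5B";

my @once = $line =~ /(\d{1,2}[A-C])/;      # no /g: captures of the first match only
my @all  = $line =~ /\d{1,2}[A-C]/g;       # /g, no groups: every whole match
my @caps = $line =~ /(\d{1,2})([A-C])/g;   # /g, two groups: all captures, flattened

print "once: @once\n";   # once: 1A
print "all:  @all\n";    # all:  1A 4B 5B
print "caps: @caps\n";   # caps: 1 A 4 B 5 B
```

With more than one group, the returned list contains the captures rather than the whole matches, so it is worth checking which of the two you actually want.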

I'd like to make a few more comments.

  • Please always start your programs with use warnings; and use strict;

  • Always, always check calls like open

Altogether

use warnings 'all';
use strict;
use feature qw(say);

my $path="file.txt";

open my $fh, "<", $path  or die "Can't open $path: $!";

while (my $line = <$fh>)
{
    my @matches = $line =~ /(\d{1,2}[A-C])/g;

    say "@matches";
}

close $fh;
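For trying this out without creating file.txt, one option is to open a filehandle on a string. The sketch below assumes the two sample lines from the question as input and keeps the matches of each line in a separate array (the @rows name is only for illustration).

```perl
use strict;
use warnings;

# Stand-in for file.txt: the two sample lines from the question.
my $data = <<'END';
            1A    OCC OCC  4B  5B  6B  7B  8B    9A
      OCC OCC    12B 13B 14B OCC 16B 17B 18B   OCC OCC
END

# Opening a reference to a scalar gives an in-memory filehandle.
open my $fh, '<', \$data or die "Can't open in-memory file: $!";

my @rows;   # one array reference of matches per input line
while (my $line = <$fh>) {
    push @rows, [ $line =~ /\d{1,2}[A-C]/g ];
}
close $fh;

print "line $_: @{ $rows[$_ - 1] }\n" for 1 .. @rows;
# line 1: 1A 4B 5B 6B 7B 8B 9A
# line 2: 12B 13B 14B 16B 17B 18B
```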
