Perl Regex Digit

I have a question/problem regarding my regex. The code section is below:

use strict;

my @list = ("1", "2", "123");

&chk(@list);

sub chk {
    my @num = split (" ", "@_");
    foreach my $chk (@num) {
        chomp $chk;
        if ($chk =~ m/\d{1,2}?/) {
            print "$chk\n";
        }
    }
}

\d{4} prints nothing. \d{3} prints only 123 . But if I change it to \d{1,2}? it prints everything. I thought, according to all the sources I have read so far, that {1,2} means at least one digit but no more than two. So it should have printed only 1 and 2 , correct? What do I need in order to extract items that contain ONLY one or two digits? Thanks for any help.

\d{1,2} succeeds if it finds 1 or 2 digits anywhere in the string provided. Additional string content does not cause the match to fail. If you want to match only when the string consists of exactly 1 or 2 digits, do this: ^\d{1,2}$
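
To make the difference concrete, here is a minimal sketch that runs both patterns over the list from the question. The helper names unanchored and anchored are made up for this illustration and are not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helpers, named for illustration only.
# unanchored: keeps every string that contains 1 or 2 digits somewhere.
sub unanchored { return grep { /\d{1,2}/ } @_ }

# anchored: keeps only strings that consist of 1 or 2 digits and nothing else.
sub anchored { return grep { /^\d{1,2}$/ } @_ }

my @list = ("1", "2", "123");

print "unanchored: @{[ unanchored(@list) ]}\n";   # prints: unanchored: 1 2 123
print "anchored:   @{[ anchored(@list) ]}\n";     # prints: anchored:   1 2
```

Without the anchors, 123 passes because the pattern is satisfied by the digits at the start of the string and the remaining 3 is simply ignored.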

You should anchor your regular expression to get the desired effect. The built-in function grep is a better fit here, since the task is to select elements from an array:

#!/usr/bin/env perl

use strict;
use warnings;

my @list = ( 1, 2, 123 );
print join "\n", grep /^\d{1,2}$/, @list;

It appears to be working perfectly!

Here's a hint: use the Perl variables $` , $& , and $' . These special regular expression variables hold the part of the string before the match, the part that was matched, and the part after the match.

Here's a sample program:

#! /usr/bin/env perl
use strict;
use warnings;
use feature qw(say);
use Scalar::Util;

my @list = ("1", "2", "123");

foreach my $string (@list) {
    if ($string =~ /\d{1,2}?/) {
        say qq(We have a match for "$string"!);
        say qq("$`"  "$&"  "$'");
    }
    else {
        say "No match makes David Sad";
    }
}

The output will be:

We have a match for "1"!
""  "1"  ""
We have a match for "2"!
""  "2"  ""
We have a match for "123"!
""  "1"  "23"

What this does is divide up the string into three sections: The section of the string before the regular expression match, the section of the string that matches the regular expression, and the section of the string after the regular expression match.

In each case, there was no pre-match because the regular expression matches from the start of the string. We also see that \d{1,2}? matches a single digit in each case, even though 123 could have matched two digits. Why? Because the question mark at the end of the quantifier tells the regular expression not to be greedy. We told it to match either one or two digits, so it is satisfied with one. Remove the question mark, and the last match would have looked like this:

We have a match for "123"!
""  "12"  "3"

If you want to match on one or two digits, but not three or more digits, you'll have to specify the part of your string before and after the one or two digits. Something like this:

/\D\d{1,2}\D/

This would match your string foo12bar , but not foo123bar . But what if the string is 12 ? In that case, we want to say that either we have the beginning of the string, or a non-digit before our one or two character match, and we either have a non-digit or the end of the string at the end of our one or two character match:

/(\D|^)\d{1,2}(\D|$)/

A quick explanation:

  • (\D|^) : A non-digit or the beginning of the string (the ^ anchor)
  • \d{1,2} : One or two digits
  • (\D|$) : A non-digit or the end of the string (the $ anchor)

Now, this will match 12 , but not 123 , and it will match foo12 and foo12bar , but not foo123 or foo123bar .
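
As a quick check of those claims, the following sketch runs the pattern against each of the strings mentioned. The sub name has_short_number is a hypothetical helper introduced only for this example.

```perl
use strict;
use warnings;

# The pattern from above: one or two digits, with a non-digit or a
# string boundary on each side.
my $re = qr/(\D|^)\d{1,2}(\D|$)/;

# Hypothetical helper: returns 1 if the string contains a run of
# exactly one or two digits, 0 otherwise.
sub has_short_number {
    my ($string) = @_;
    return $string =~ $re ? 1 : 0;
}

for my $string (qw(12 123 foo12 foo12bar foo123 foo123bar)) {
    printf "%-10s %s\n", $string,
        has_short_number($string) ? "match" : "no match";
}
```

Note that the pattern only asks whether some run of one or two digits exists, so a string such as foo123bar45 would still match because of the 45 at the end.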

If the whole string should be just a one or two digit number, we can simply use the anchors:

/^\d{1,2}$/;

That will match 1 and 12 , but not foo12 or 123 .

The main thing is to use the $` , $& , and $' variables in order to help see exactly what your regular expression is matching on and what's before and after your match.
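
One way to use this when debugging is to wrap the three variables in a small helper and try several patterns against the same string. This is only a sketch; match_parts and the labels are names invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (pre-match, match, post-match) for the
# first match of $re in $string, or an empty list if nothing matches.
sub match_parts {
    my ($string, $re) = @_;
    if ($string =~ $re) {
        my @parts = ($`, $&, $');   # copy before another match overwrites them
        return @parts;
    }
    return;
}

my @patterns = (
    [ 'non-greedy', qr/\d{1,2}?/  ],
    [ 'greedy',     qr/\d{1,2}/   ],
    [ 'anchored',   qr/^\d{1,2}$/ ],
);

for my $pattern (@patterns) {
    my ($label, $re) = @$pattern;
    my @parts = match_parts("123", $re);
    my $shown = @parts ? join("  ", map { qq("$_") } @parts) : "no match";
    print "$label: $shown\n";
}
```

For the string 123 this shows the non-greedy pattern matching 1, the greedy pattern matching 12, and the anchored pattern not matching at all.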

No, because while the regex only matches two digits, $chk still contains 123 . If you want to only print the part that is matched, use

if ($chk =~ m/(\d{1,2})/) {
    print "$1\n";
}

Note the parentheses and the $1. This causes it to print only that which is in the parentheses.

Also, this code doesn't make much sense:

sub chk {
    my @num = split (" ", "@_");

Because @_ already is an array it makes no sense to make it into a string and then split it. Simply do:

sub chk {
    foreach my $chk (@_) {

You also do not need to use chomp for data that is not coming from user input, as it is intended to remove the trailing newline. There is no newline in any of this data.

#!/usr/bin/perl
use strict;

my @list = ("1", "2", "123");

&chk(\@list);

sub chk {

    foreach my $chk (@{$_[0]}) {
        print "$chk\n" if $chk =~ m/^\d{1,2}$/ ;        
    }
}

#!/usr/bin/perl
use strict;
use warnings;
my @list = ("1", "2", "123");

&chk(@list);

sub chk {
    my @num = split (" ", "@_");
    foreach my $chk (@num) {
        chomp $chk;
        if ($chk =~ m/\d{1,2}/ && length($chk) <= 2) {
            print "$chk\n";
        }
    }
}
