
Searching an array with a regex in Perl

my @array = ('Joe','Jim','Jim_BOB','Hello');
$search = "Joe";
$search2 = "Hello";
$search3 = "Jim";
$search4 =~ qw/.*?_.*?/;

my %index;
@index{@array} = (0..$#array);
my $index = $index{$search};
my $index2 = $index{$search2};
my $index3 = $index{$search3};
my $index4 = $index{$search4};
print $index,",",$index2,",",$index3,",",$index4, "\n";

This prints 0,3,1, which are the indexes of the first three $search terms in @array. The lookup does not recognize $search4, however, because it is supposed to be a regex. My question is: how do I search @array with a regex?

qw is used to quote lists of words. To store a regex in a variable, use qr instead:

my $search4 = qr/_/; # the leading and trailing '.*?' are redundant 

Get the first matching index (grep returns matches in array order, and the list assignment keeps only the first one):

my ($index4) = grep $array[$_] =~ /$search4/, 0..$#array; 

Or all of them:

my @i = grep $array[$_] =~ /$search4/, 0..$#array;
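Putting these pieces together, a minimal self-contained sketch might look like the following. The names @matches and $first are only illustrative and do not come from the question.

```perl
use strict;
use warnings;

my @array   = ('Joe', 'Jim', 'Jim_BOB', 'Hello');
my $search4 = qr/_/;    # compiled regex: any element containing an underscore

# grep over the indexes, keeping those whose element matches the regex
my @matches = grep { $array[$_] =~ $search4 } 0 .. $#array;

# list assignment keeps only the first (lowest) matching index
my ($first) = @matches;

print "all: @matches\n";
print "first: ", (defined $first ? $first : 'none'), "\n";
```

Checking defined rather than truth matters here, because a match at index 0 is a valid result that is false in boolean context.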

Your current approach using a hash will only return the last matching index if your array contains duplicate elements, because each later duplicate overwrites the earlier entry for the same key. Other answers have shown how you can fix your existing code, but to allow for duplicate elements, you can use List::MoreUtils.

The following shows how to get the first and last matching indexes for both a fixed search string and a regex, as well as how to get all matching indexes:

use strict;
use warnings;
use 5.010;

use List::MoreUtils qw(first_index last_index indexes);

my @words = qw(Joe Jim Jim_BOB Hello Jim Hello Jim);

my $string = 'Jim';
my $regex = '^J';

say "First $string: " . first_index { $_ eq $string } @words;
say "Last $string: " . last_index { $_ eq $string } @words;
say "All $string: " . join ', ', indexes { $_ eq $string } @words;

say "First regex: " . first_index { /$regex/ } @words;
say "Last regex: " . last_index { /$regex/ } @words;
say "All regex: " . join ', ', indexes { /$regex/ } @words;

Output:

First Jim: 1
Last Jim: 6
All Jim: 1, 4, 6
First regex: 0
Last regex: 6
All regex: 0, 1, 2, 4, 6
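To see why the hash-slice lookup loses duplicates, here is a small sketch using only core Perl. The hash names %last_index and %all_indexes are made up for this illustration, and the hash-of-arrays variant is just one possible way to keep every index if you would rather not install a module.

```perl
use strict;
use warnings;

my @words = qw(Joe Jim Jim_BOB Hello Jim Hello Jim);

# Hash slice: each later duplicate overwrites the earlier index
my %last_index;
@last_index{@words} = (0 .. $#words);

# Hash of arrays: collect every index under its word
my %all_indexes;
push @{ $all_indexes{ $words[$_] } }, $_ for 0 .. $#words;

print "slice lookup for Jim: $last_index{Jim}\n";
print "all indexes for Jim: @{ $all_indexes{Jim} }\n";
```

This only helps for exact string lookups; a regex search still has to scan the array or the hash keys.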

In your code there is an unrelated problem: $search4 is not a regex. The statement $search4 =~ qw/.*?_.*?/; matches the undefined variable $search4 against the result of qw/.*?_.*?/. qw splits its contents on whitespace into a list of words. In this case there is no whitespace, so you are matching against the single string .*?_.*? used as a pattern. In void context the match has no effect at all, and $search4 is left undefined.
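The difference between the two quoting operators can be checked directly. This is a small illustrative sketch; the variable names @words and $regex are not from the question.

```perl
use strict;
use warnings;

my @words = qw/.*?_.*?/;    # qw// builds a list of whitespace-separated words
my $regex = qr/.*?_.*?/;    # qr// compiles a regex object

print "qw gave ", scalar(@words), " word(s): $words[0]\n";
print "qr gave a ", ref($regex), "\n";
print "Jim_BOB matches\n" if 'Jim_BOB' =~ $regex;
```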

With use strict; use warnings; in place, and after declaring your variables, you would have gotten appropriate warnings:

$ cat t.pl
use strict;
use warnings;

my @array = ('Joe','Jim','Jim_BOB','Hello');
my $search = "Joe";
my $search2 = "Hello";
my $search3 = "Jim";
my $search4 =~ qw/.*?_.*?/;

my %index;
@index{@array} = (0..$#array);
my $index = $index{$search};
my $index2 = $index{$search2};
my $index3 = $index{$search3};
my $index4 = $index{$search4};
print $index,",",$index2,",",$index3,",",$index4, "\n";

$ perl t.pl 
Use of uninitialized value $search4 in pattern match (m//) at t.pl line 8.
Use of uninitialized value $search4 in hash element at t.pl line 15.
Use of uninitialized value $index4 in print at t.pl line 16.
0,3,1,

I assume that you meant $search4 = qr/.*?_.*?/.

One solution to your problem would be to treat the regexp as a special case and to loop over your array.

$ cat t2.pl 
use strict;
use warnings;

my @array = ('Joe','Jim','Jim_BOB','Hello');
my $search = 'Joe';
my $search2 = 'Hello';
my $search3 = 'Jim';
my $search4 = qr/.*?_.*?/;

my %index;
@index{@array} = (0..$#array);
my $index = $index{$search};
my $index2 = $index{$search2};
my $index3 = $index{$search3};

# loop over the array until a match is found
my $cnt = 0;
my $index4;
for my $elem ( @array ) {
    if ( $elem =~ $search4 ) {
        $index4 = $cnt;
        last;
    }
    $cnt++;
}

print "$index,$index2,$index3,$index4\n";

$ perl t2.pl 
0,3,1,2
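The same early-exit search can also be written without a manual counter. The sketch below swaps in first from the core module List::Util, applied to the index range instead of the elements. This is an alternative formulation rather than the code above, and it uses the simpler pattern qr/_/.

```perl
use strict;
use warnings;
use List::Util qw(first);

my @array   = ('Joe', 'Jim', 'Jim_BOB', 'Hello');
my $search4 = qr/_/;

# first() stops at the first index whose element matches, or returns undef
my $index4 = first { $array[$_] =~ $search4 } 0 .. $#array;

print defined($index4) ? "$index4\n" : "no match\n";
```

As with the loop, test the result with defined, since index 0 is a legitimate match.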

If you want to use a lookup hash, then you might like the module Tie::Hash::Regex from CPAN.

$ cat t3.pl 
use strict;
use warnings;

# modules from CPAN
use Tie::Hash::Regex;

my @array = ('Joe','Jim','Jim_BOB','Hello');
my $search = "Joe";
my $search2 = "Hello";
my $search3 = "Jim";
my $search4 = qr/.*?_.*?/;

my %index;
tie %index, 'Tie::Hash::Regex';
@index{@array} = (0..$#array);
my $index = $index{$search};
my $index2 = $index{$search2};
my $index3 = $index{$search3};
my $index4 = $index{$search4};
print "$index,$index2,$index3,$index4\n";

$ perl t3.pl 
0,3,1,2

Beware that there are some downsides to that solution. If multiple keys match, then you have no guarantee which of the matching keys you will get. And if you pass a string that does not look like a regexp, it will still be treated as a regexp.

