I can see from this answer that if I do
sub match_all_positions {
    my ($regex, $string) = @_;
    my @ret;
    while ($string =~ /$regex/g) { push @ret, $-[0] }
    return @ret;
}
print join ',', match_all_positions('0{3}', '001100010000');
I get
4,8
What do I need to do to get the indexes of all matches, even when they overlap, such as positions 8 and 9 in the example above?
I can do
sub match_all_positions_b {
    my ($substr, $string) = @_;
    # index() returns -1 when there is no match; 0 is a valid position
    return unless index($string, $substr) >= 0;
    my @res;
    my $i = 0;
    while ($i <= (length($string) - length($substr))) {
        $i = index($string, $substr, $i);
        last if $i < 0;
        push @res, $i++;    # record the hit, then resume one character later
    }
    return @res;
}
print join ',', match_all_positions_b('000', '001100010000');
which only lets me match a literal substring, or
sub match_all_positions_c {
    my ($substr, $string) = @_;
    my $re = '^(?:' . $substr . ')';    # group so alternations stay anchored
    my @res;
    # the match length is unknown for a regex, so try every start offset
    for (0 .. length($string)) {
        push @res, $_ if substr($string, $_) =~ /$re/;
    }
    return @res;
}
print join ',', match_all_positions_c('0{3}', '001100010000');
which is twice as slow.
Is there a way to get all matches, even when they overlap? Or should I just accept the speed loss because it is inherent to using regex matches?
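You need to wrap your regex in a zero-width look-ahead. A look-ahead matches without consuming any characters, so after each hit the search resumes one character further on instead of skipping past the matched text, and overlapping matches are found.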
Try calling your function like this:
print join ',', match_all_positions('(?=0{3})', '001100010000');
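If you would rather not have every caller add the look-ahead by hand, the wrapping can be done inside the function. This is only a minimal sketch: the name match_all_positions_overlapping is made up for illustration, and it assumes the pattern is passed as a plain string as in the question.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this example): wraps the
# caller's pattern in a zero-width look-ahead so no characters are consumed.
sub match_all_positions_overlapping {
    my ($regex, $string) = @_;
    my @ret;
    # After each empty match Perl moves on by one character,
    # so every start offset in the string gets tried.
    while ($string =~ /(?=$regex)/g) {
        push @ret, $-[0];
    }
    return @ret;
}

print join(',', match_all_positions_overlapping('0{3}', '001100010000')), "\n";
# prints 4,8,9
```

Because the pattern sits inside the (?=...) group, an alternation such as 'ab|cd' stays contained without extra parentheses.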
If you want the start and end offsets of the match found at each starting position:
my @matches;
push @matches, "$-[1]:$+[1]" while "aabbcc" =~ /(?=(a.*c))/sg;
Output:
0:6
1:6
If you want all possible matches, including every end position for the same start:
local our @matches;
"aabbcc" =~ /(a.*?c)(?{ push @matches, "$-[1]:$+[1]" })(?!)/s;
Output:
0:5
0:6
1:5
1:6
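The backtracking trick can also be wrapped in a function. The following is a sketch under assumptions, not a general solution: the helper name all_match_spans and the variable @spans are invented here, and the pattern a.*?c is hard-coded, because interpolating a pattern that contains a (?{ ... }) block would additionally need "use re 'eval'". The code block records the offsets of each candidate match, and the always-failing (?!) forces the engine to backtrack and try the next one.

```perl
use strict;
use warnings;

# Package variable, as in the answer above: code blocks inside a regex
# have historically been unreliable with lexicals on older perls.
our @spans;

# Hypothetical helper: collects "start:end" for every way a.*?c can match.
sub all_match_spans {
    my ($string) = @_;
    local @spans;
    # (?{ ... }) runs each time the capture group has matched;
    # (?!) then fails, so the engine backtracks into the next candidate.
    $string =~ /(a.*?c)(?{ push @spans, "$-[1]:$+[1]" })(?!)/s;
    my @copy = @spans;    # copy before the localized array is restored
    return @copy;
}

# Show each span together with the substring it covers.
for my $span (all_match_spans('aabbcc')) {
    my ($start, $end) = split /:/, $span;
    print "$span\t", substr('aabbcc', $start, $end - $start), "\n";
}
```

The overall match always fails by design, so only the side effect of the code block matters; the return value of the match itself should be ignored.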