How do I know which part of a Perl regular expression matched the string?

Question

I want to search the lines of a file to see whether they match any one of a set of regular expressions.

Something like this:

my @regs = (qr/a/, qr/b/, qr/c/);
foreach my $line (<ARGV>) {
   foreach my $reg (@regs) {
      if ($line =~ /$reg/) {
         printf("matched %s\n", $reg);
      }
   }
}

But this can be slow.

It seems the regex compiler could help. Is there an optimization along these lines:

my $master_reg = join("|", @regs); # this is wrong syntax. what's the right way?
foreach my $line (<ARGV>) {
   $line =~ /$master_reg/;
   my $matched = special_function();
   printf("matched the %sth reg: %s\n", $matched, $regs[$matched]);
}

where 'special_function' is the special sauce that tells me which part of the regular expression matched.

Answer 1

Use capturing parentheses. The basic idea is:

my @matches = $foo =~ /(one)|(two)|(three)/;
defined $matches[0]
    and print "Matched 'one'\n";
defined $matches[1]
    and print "Matched 'two'\n";
defined $matches[2]
    and print "Matched 'three'\n";

Answer 2

Add capture groups:

"pear" =~ /(a)|(b)|(c)/;
if (defined $1) {
    print "Matched a\n";
} elsif (defined $2) {
    print "Matched b\n";
} elsif (defined $3) {
    print "Matched c\n";
} else {
    print "No match\n";
}

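Obviously, in this simple example you could just use /(a|b|c)/ and print $1, but this approach wins when 'a', 'b', and 'c' can be arbitrarily complex expressions.

If you are building the regex programmatically, having to use the numbered variables is painful, so rather than turning off strict, look at the @- or @+ arrays, which hold the offsets of each capture group in the last successful match. $-[0] is always set whenever the pattern matches at all, but a higher $-[$n] contains a defined value only if the nth capture group took part in the match.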
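For what it's worth, here is a rough sketch of how the @- idea could be wired into the loop from the question. The helper name which_matched is made up for illustration, and the sketch assumes none of the individual patterns contain capture groups of their own (otherwise the group numbers would shift). Note that it reports the alternative that produced the leftmost match, which is not necessarily the first pattern in the list that could match somewhere in the line.

```perl
use strict;
use warnings;

my @regs = (qr/a/, qr/b/, qr/c/);

# Wrap each pattern in its own capture group and join with '|'.
# Assumption: the patterns themselves contain no capture groups.
my $alternation = join '|', map { "($_)" } @regs;
my $master_reg  = qr/$alternation/;

# Hypothetical helper: returns the 0-based index of the pattern whose
# group took part in the match, or nothing if the line did not match.
sub which_matched {
    my ($line, $master, $count) = @_;
    return unless $line =~ $master;
    for my $n (1 .. $count) {
        # $-[$n] is defined only if capture group $n matched.
        return $n - 1 if defined $-[$n];
    }
    return;
}

for my $line ("pear", "bob", "xyz") {
    my $matched = which_matched($line, $master_reg, scalar @regs);
    if (defined $matched) {
        printf "%s: matched reg %d: %s\n", $line, $matched, $regs[$matched];
    } else {
        print "$line: no match\n";
    }
}
```

Whether this is actually faster than looping over the individual patterns depends on the patterns, so it is worth benchmarking on real input before switching.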


2 solutions

Solution 1
8 Accepted 2011-07-15 00:24:29

Solution 2
5 2011-07-15 00:27:22
