
Named captures that match more than once (Perl)

When I run this code:

use feature 'say';    # enables say()

$_='xaxbxc';
if(/(x(?<foo>.))+/) {
    say "&: ", $&;
    say "0: ", $-{foo}[0];
    say "1: ", $-{foo}[1];
}

I get:

&: xaxbxc
0: c
1:

I understand that this is how it's supposed to work, but I would like to be able to somehow get the list of all matches ('a', 'b', 'c') instead of just the last match (c). How can I do this?

I don't think there is a way to do this in general (please correct me if I am wrong), but there is likely to be a way to accomplish the same end goal in specific situations. For example, this would work for your specific code sample:

$_='xaxbxc';
while (/x(?<foo>.)/g) {
    say "foo: ", $+{foo};
}

What exactly are you trying to accomplish? Perhaps we could find a solution for your actual problem even if there is no way to do repeating captures.

Perl allows a regular expression to match multiple times when the "g" modifier is placed after the closing delimiter. Each individual match can then be looped over, as described in the Global Matching subsection of the Using Regular Expressions in Perl section of the Perl Regex Tutorial:

while(/(x(?<foo>.))/g){    # no trailing "+": each iteration matches a single "x." pair
    say "&: ", $&;
    say "foo: ", $+{foo};
}

This will produce an iterated list:

&: xa
foo: a
&: xb
foo: b
&: xc
foo: c

Which still isn't what you want, but it's really close. Combining a global regex (/g) with your previous local regex will probably do it. Generally, make a capturing group around your repeated group, then re-parse just that group with a global regex that represents a single iteration of that group, and iterate over it or use it as a list.
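A minimal sketch of that two-step idea, using the sample string from the question. The variable names ($str, $run, @foo) are made up for illustration, and the named capture is replaced by a plain group because list-context /g returns the captures directly:

```perl
use strict;
use warnings;
use feature 'say';

my $str = 'xaxbxc';
my @foo;

# Step 1: capture the whole run of repeated "x." pairs in one group.
if ($str =~ /((?:x.)+)/) {
    my $run = $1;

    # Step 2: re-scan only that run with a pattern for a single iteration;
    # in list context, /g returns the capture from every match.
    @foo = $run =~ /x(.)/g;
}

say "@foo";    # a b c
```

The outer match still tells you whether the repeated group was present and where it sits; only the inner /g pass is responsible for collecting the individual values.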

It looks like a question fairly similar to this one (at least in answer, if not formulation) has been answered by someone much more competent at Perl than I: "Is there a Perl equivalent of Python's re.findall/re.finditer (iterative regex results)?" You might want to check the answers there as well, for more details about the proper use of global regexes. (Perl isn't my language, I just have an unhealthy appreciation for regular expressions.)

In situations like these, using embedded code blocks provides an easy way out:

my @match;
$_='xaxbxc';
if(/((?:x(.)(?{push @match, $^N}))+)/) {
    say "\$1: ", $1;
    say "@match"
}

which prints:

$1: xaxbxc
a b c

The %- variable is used when you have more than one of the same named group in the same pattern, not when a given group happens to be iterated.

That's why /(.)+/ doesn't load up $1 with each separate character, just with the last one. Same with /(?<x>.)+/. However, with /(?<x>.)(?<x>.)/ you have two different <x> groups, so $-{x} holds two entries. Consider:

% perl -le '"foobar" =~ /(?<x>.)(?<x>.)/; print "x#1 is $-{x}[0], x#2 is $-{x}[1]"'
x#1 is f, x#2 is o

% perl -le '"foobar" =~ /(?:(?<x>.)(?<x>.))+/; print "x#1 is $-{x}[0], x#2 is $-{x}[1]"'
x#1 is a, x#2 is r

I'm not sure that is exactly what you're looking for, but the following code should do the trick.

$_='xaxbxc';
@l = /x(?<foo>.)/g;

print join(", ", @l)."\n";

But I'm not sure this would work with overlapping strings.
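To make the overlapping case concrete, here is a small sketch. The input 'xaxxb' is an assumed example, chosen so that one "x" is both the captured character of one pair and the start of the next. The lookahead variant is one common way to let matches overlap, not something taken from the answers above:

```perl
use strict;
use warnings;
use feature 'say';

my $str = 'xaxxb';    # assumed sample: the third "x" is captured by "xx" and also starts "xb"

# Plain /g resumes after the end of each match, so a character that was
# consumed as a capture cannot start another match.
my @plain = $str =~ /x(.)/g;

# A lookahead consumes nothing, so the pattern is tried at every position
# and overlapping pairs are found as well.
my @overlap = $str =~ /(?=x(.))/g;

say "@plain";      # a x
say "@overlap";    # a x b
```

So the list-context /g form silently skips overlapping pairs; whether that matters depends on the data.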
