
How can I access the value in a named capture group in a regex in Perl?

I'm trying to access the data captured by a named capture group that is called as a subroutine:

use strict;
use warnings;
"this is a test" =~ /(?!)
(?<isa>is\s+a)
| (?&isa)\s
(?<test>test)/x;
print "isa: $+{isa}\ntest: $+{test}"

And here's another attempt:

use strict;
use warnings;
"this is a test" =~ /(?!)
(?<isa_>(?<isa>is\s+a))
| (?&isa_)\s
(?<test>test)/x;
print "isa: $+{isa}\ntest: $+{test}"

I can't seem to get $+{isa} to be populated. Why is that, and how can I populate it?

Since you force the first branch to fail with (?!), the named capture group (?<isa>...) that follows it never captures anything. It is still defined as a subpattern, though, so it can be called by name.

Only the second branch succeeds, but it doesn't capture anything for the group "isa"; it only calls the subpattern through the alias (?&isa_).
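As a minimal sketch of the point above, the following runs the first example in its current form (calling (?&isa)) and reports what is and is not populated. The variable names $matched, $isa and $test are only illustrative, and the defined-or operator (//) assumes Perl 5.10 or later, which named captures need anyway.

```perl
use strict;
use warnings;

# The first branch always fails because of (?!), so only the second
# branch can match; it calls the "isa" subpattern instead of capturing.
my $matched = "this is a test" =~ /(?!)
    (?<isa> is \s+ a )
  | (?&isa) \s
    (?<test> test )/x;

# Use defined-or so that an unset group prints a marker instead of
# triggering an "uninitialized value" warning.
my $isa  = $+{isa}  // '(undef)';
my $test = $+{test} // '(undef)';

print "matched: ", ($matched ? "yes" : "no"), "\n";
print "isa: $isa\ntest: $test\n";
```

The match as a whole should succeed and "test" should be set, while "isa" stays undefined.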

Your first example fails to compile with the error:

Reference to nonexistent named group in regex

since "isa_" is defined nowhere.

[EDIT] You have since changed "isa_" to "isa" in your first example, but with this new version there is still no reason for anything to be captured in the "isa" named group.

Your second example will not populate "isa" either, because a capture group captures only where it is defined, not where it is called (even though isa_ contains the group isa).

The reason is that Perl doesn't keep captures made inside a recursion once the recursion returns; only captures made at the ground level, outside any subpattern call, are kept. You can test it with this example, which prints the value of "isa" from inside the recursion:

"this is a test" =~ /
  (?!)
  (?<isa_>
      (?<isa> is \s+ a)
      (?{print "isa in recursion: $+{isa}\n"})
  )
|
  (?&isa_) \s (?<test> test )
/x;

print "isa: $+{isa}\ntest: $+{test}"

However, you can write:

"this is a test" =~ /
  (?!) (?<isa_> is \s+ a )
|
  (?<isa> (?&isa_) ) \s (?<test> test )
/x;

print "isa: $+{isa}\ntest: $+{test}";

This works because here the named capture "isa" is at the ground level and merely wraps the call to the subpattern.


Note: instead of using (?!) and an alternation to define subpatterns in a branch that always fails, you can use the (?(DEFINE)...) syntax:

/(?(DEFINE)
     (?<isa_> (?<isa> is \s+ a) )
 )
 (?&isa_) \s (?<test> test )
/x

or this one:

/(?<isa_> (?<isa> is \s+ a) ){0}
 (?&isa_) \s (?<test> test )
/x

In this way you avoid the cost of an alternation.
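Putting the two ideas together, here is a hedged sketch that defines the subpattern in a (?(DEFINE)...) block and wraps the call in a ground-level named group so that "isa" is actually populated. The variables $text, $isa and $test are only illustrative names.

```perl
use strict;
use warnings;

my $text = "this is a test";
my ($isa, $test);

if ($text =~ /
    (?(DEFINE)
        (?<isa_> is \s+ a )   # definition only, never matched directly
    )
    (?<isa> (?&isa_) ) \s     # ground-level group wrapping the call
    (?<test> test )
/x) {
    # Copy the captures out while the match variables are still in scope.
    ($isa, $test) = ($+{isa}, $+{test});
    print "isa: $isa\ntest: $test\n";
}
```

Because the group "isa" sits outside the DEFINE block and outside any recursion, its capture survives after the match.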
