Regular Expression named capturing groups support in Java 7

Since Java 7, the regular expressions API has supported named capturing groups. The method java.util.regex.Matcher.group(String) returns the input subsequence captured by the given named capturing group, but the API documentation provides no example.

What is the right syntax to specify and retrieve a named capturing group in Java 7?

Specifying a named capturing group

Take the following regex with a single capturing group as an example: ([Pp]attern)

Below are 4 examples of how to specify a named capturing group for the regex above:

(?<Name>[Pp]attern)
(?<group1>[Pp]attern)
(?<name>[Pp]attern)
(?<NAME>[Pp]attern)

Note that the name of the capturing group must strictly match the following pattern:

[A-Za-z][A-Za-z0-9]*

The group name is case-sensitive, so you must use the exact group name whenever you refer to the group (see below).
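
As a minimal sketch of the naming rules above, the following shows a group being retrieved by name and what happens when the case does not match. The class name NamedGroupBasics and the helper capture are made up for illustration and are not part of the Java API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupBasics {
    // Hypothetical helper: returns the text captured by the group called
    // groupName, or null if the regex does not match the input at all.
    static String capture(String regex, String groupName, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(groupName) : null;
    }

    public static void main(String[] args) {
        String regex = "(?<Name>[Pp]attern)";

        // The exact name "Name" works.
        System.out.println(capture(regex, "Name", "a Pattern here")); // Pattern

        try {
            // Group names are case-sensitive: "name" is not "Name".
            capture(regex, "name", "a Pattern here");
        } catch (IllegalArgumentException e) {
            System.out.println("No such group: " + e.getMessage());
        }
    }
}
```

Matcher.group(String) throws IllegalArgumentException when the pattern has no group with the given name, which is why the second lookup ends up in the catch block.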

Back-referencing a named capturing group in the regex

To back-reference the content matched by a named capturing group within the regex (corresponding to the 4 examples above):

\k<Name>
\k<group1>
\k<name>
\k<NAME>

The named capturing group is still numbered, so in all 4 examples it can also be back-referenced with \1 as usual (written "\\1" in a Java string literal).
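
A small sketch of both back-reference styles, using a doubled-word check as the example. The group name word, the class name and the helper method are assumptions chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedBackreference {
    // Back-reference by name: \k<word> must repeat whatever (?<word>...) captured.
    static final Pattern BY_NAME = Pattern.compile("\\b(?<word>\\w+)\\s+\\k<word>\\b");

    // The same named group is also group number 1, so \1 works as well.
    static final Pattern BY_NUMBER = Pattern.compile("\\b(?<word>\\w+)\\s+\\1\\b");

    // Hypothetical helper: returns the first immediately repeated word, or null.
    static String firstDoubledWord(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? m.group("word") : null;
    }

    public static void main(String[] args) {
        String text = "This is is a test";
        System.out.println(firstDoubledWord(BY_NAME, text));   // is
        System.out.println(firstDoubledWord(BY_NUMBER, text)); // is
    }
}
```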

Referring to a named capturing group in the replacement string

To refer to the capturing group in the replacement string (corresponding to the 4 examples above):

${Name}
${group1}
${name}
${NAME}

As above, in all 4 examples the content of the capturing group can also be referred to with $1 in the replacement string.
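
To make the replacement syntax concrete, here is a short sketch that wraps every match in angle brackets, once by name and once by number. The class and method names are illustrative only.

```java
class NamedReplacement {
    // Refers to the captured text by group name in the replacement string.
    static String wrapByName(String input) {
        return input.replaceAll("(?<name>[Pp]attern)", "<${name}>");
    }

    // Refers to the same group by its number.
    static String wrapByNumber(String input) {
        return input.replaceAll("(?<name>[Pp]attern)", "<$1>");
    }

    public static void main(String[] args) {
        String input = "my pattern and Pattern";
        System.out.println(wrapByName(input));   // my <pattern> and <Pattern>
        System.out.println(wrapByNumber(input)); // my <pattern> and <Pattern>
    }
}
```

String.replaceAll, Matcher.replaceAll and Matcher.appendReplacement all interpret ${name} in the replacement string in this way.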

Named capturing group in COMMENTS mode

This section uses (?<name>[Pp]attern) as its example.

Oracle's implementation of COMMENTS mode (Pattern.COMMENTS, or the embedded flag (?x)) parses the following examples as identical to the regex above:

(?x)  (  ?<name>             [Pp] attern  )
(?x)  (  ?<  name  >         [Pp] attern  )
(?x)  (  ?<  n  a m    e  >  [Pp] attern  )

Apart from ?<, which must not be split, arbitrary whitespace is allowed, even inside the name of the capturing group.
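
Since this is implementation behavior rather than something the API documentation promises, it is worth checking on the JDK you actually run. The sketch below tries each spelling and reports what happens. The class name and the tryCapture helper are made up for illustration, and no output is claimed for the forms with whitespace inside the name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentsModeNamedGroup {
    // Hypothetical helper: returns the text captured by group "name",
    // "<no match>" if the regex does not match, or "<rejected>" if this JDK
    // refuses the regex or does not know a group called "name".
    static String tryCapture(String regex, String input) {
        try {
            Matcher m = Pattern.compile(regex).matcher(input);
            return m.find() ? m.group("name") : "<no match>";
        } catch (IllegalArgumentException e) {
            // PatternSyntaxException is a subclass of IllegalArgumentException.
            return "<rejected>";
        }
    }

    public static void main(String[] args) {
        String[] variants = {
            "(?<name>[Pp]attern)",
            "(?x)  (  ?<name>             [Pp] attern  )",
            "(?x)  (  ?<  name  >         [Pp] attern  )",
            "(?x)  (  ?<  n  a m    e  >  [Pp] attern  )",
        };
        for (String regex : variants) {
            System.out.println(regex + " -> " + tryCapture(regex, "a Pattern here"));
        }
    }
}
```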

Same name for different capturing groups?

While .NET, Perl and PCRE allow the same name to be defined for different capturing groups, Java does not support this (as of Java 8). You can't use the same name for different capturing groups.
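
A quick way to see this restriction is to compile a regex that reuses a name. The sketch below uses a made-up group name year and an illustrative helper compiles.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class DuplicateGroupName {
    // Hypothetical helper: reports whether the regex compiles at all.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Same name in two alternatives: rejected at compile time.
        System.out.println(compiles("(?<year>\\d{4})|(?<year>\\d{2})"));   // false

        // Distinct names: accepted.
        System.out.println(compiles("(?<year4>\\d{4})|(?<year2>\\d{2})")); // true
    }
}
```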

APIs related to named capturing groups

New methods in the Matcher class that support retrieving captured text by group name:

group(String name) (since Java 7)
start(String name) and end(String name) (since Java 8)

The corresponding methods are missing from the MatchResult interface as of Java 8. There is an ongoing enhancement request, JDK-8065554, for this issue.
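
The sketch below assumes Java 8 or later, where Matcher also has start(String) and end(String), and shows the by-name lookups next to the number-based fallback that a MatchResult snapshot offers. The class and method names are illustrative.

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupPositions {
    static final Pattern PATTERN = Pattern.compile("(?<name>[Pp]attern)");

    // Looks the group up by name on the Matcher: text plus [start, end) offsets.
    static String describe(String input) {
        Matcher m = PATTERN.matcher(input);
        if (!m.find()) {
            return "no match";
        }
        return m.group("name") + " [" + m.start("name") + ", " + m.end("name") + ")";
    }

    // A MatchResult snapshot is queried here by group number instead.
    static String viaMatchResult(String input) {
        Matcher m = PATTERN.matcher(input);
        if (!m.find()) {
            return "no match";
        }
        MatchResult result = m.toMatchResult();
        return result.group(1);
    }

    public static void main(String[] args) {
        System.out.println(describe("a Pattern here"));       // Pattern [2, 9)
        System.out.println(viaMatchResult("a Pattern here")); // Pattern
    }
}
```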

As of Java 8, there is no API to get the list of named capturing groups in the regex, so we have to jump through extra hoops to get it. That said, such a list is of little use for most purposes, except for writing a regex tester.
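
One such hoop, shown here only as a rough heuristic and not as the approach this answer had in mind, is to scan the regex source text for the (?<name> syntax. The class GroupNameScanner and the method groupNames are hypothetical, and the scan is fooled by escaped parentheses, character classes and \Q...\E quoting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNameScanner {
    // Matches "(?<name>" where name follows [A-Za-z][A-Za-z0-9]*.
    // Lookbehinds such as (?<=...) and (?<!...) are skipped because the
    // character after "<" must be a letter.
    private static final Pattern NAMED_GROUP =
            Pattern.compile("\\(\\?<([A-Za-z][A-Za-z0-9]*)>");

    // Heuristic only: works on the regex text, not on a compiled Pattern.
    static List<String> groupNames(String regex) {
        List<String> names = new ArrayList<>();
        Matcher m = NAMED_GROUP.matcher(regex);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(groupNames("(?<year>\\d{4})-(?<month>\\d{2})(?<=\\d)")); // [year, month]
    }
}
```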

The new syntax for a named capturing group is (?<name>X), for a group X named "name". The following code uses the regex (\\w+), written as a Java string literal, which captures any run of word characters (letters, digits and underscores). To name this capturing group, add ?<teste> inside the parentheses, just before the expression to be captured.

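import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern pattern = Pattern.compile("(?<teste>\\w+)");
Matcher matcher = pattern.matcher("The first word is a match");
if (matcher.find()) { // group(String) throws IllegalStateException unless a match was found
    String myNamedGroup = matcher.group("teste");
    System.out.printf("This is your named group: %s", myNamedGroup);
}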

This code prints the following output:

This is your named group: The
