How to get the names of the regex named capturing group in a match in Java?

Given:

String text = "FACEBOOK is buying GOOGLE and FACE BOOK";

and:

Pattern pattern = Pattern.compile("(?<FB>(FACE(\\p{Space}?)BOOK))|(?<GOOGL>(GOOGL(E)?))");
Matcher matcher = pattern.matcher(text);

I want to get something like this:

Group=FB matches substring="FACEBOOK" at position=[0, 8)
Group=GOOGL matches substring="GOOGLE" at position=[19, 25)
Group=FB matches substring="FACE BOOK" at position=[30, 39)

However, I have been unable to get the group name. Here is my attempt in Scala:

import java.util.regex.Pattern

val pattern = Pattern.compile("(?<FB>(FACE(\\p{Space}?)BOOK))|(?<GOOGL>(GOOGL(E)?))")
val text = "FACEBOOK is buying GOOGLE and FACE BOOK"
val matcher = pattern.matcher(text)

while (matcher.find()) {
  println(s"Group=???? matches substring=${matcher.group()} at position=[${matcher.start},${matcher.end})")
}

EDIT: Someone marked this as a duplicate of "Get group names in java regex", but this is a different question. This one asks how to find the group name given a match. The other question asks how to get the group names (as strings or indices) from a Pattern object.

Here is my attempt in Scala:

import java.util.regex.{MatchResult, Pattern}

// Wraps a compiled Pattern together with the names of its named groups.
class GroupNamedRegex(pattern: Pattern, namedGroups: Set[String]) {
  // Collect the group names by scanning the regex source for "(?<name>" openers.
  def this(regex: String) = this(Pattern.compile(regex),
    "\\(\\?<([a-zA-Z][a-zA-Z0-9]*)>".r.findAllMatchIn(regex).map(_.group(1)).toSet)

  def findNamedMatches(s: String): Iterator[GroupNamedRegex.Match] = new Iterator[GroupNamedRegex.Match] {
    private[this] val m = pattern.matcher(s)
    private[this] var _hasNext = m.find()

    override def hasNext = _hasNext

    override def next() = {
      // Pair the current match with a named group that took part in it, if any.
      val ans = GroupNamedRegex.Match(m.toMatchResult, namedGroups.find(group => m.group(group) != null))
      _hasNext = m.find() // advance to the next match
      ans
    }
  }
}

object GroupNamedRegex extends App {
  case class Match(result: MatchResult, groupName: Option[String])

  val r = new GroupNamedRegex("(?<FB>(FACE(\\p{Space}?)BOOK))|(?<GOOGL>(GOOGL(E)?))")
  println(r.findNamedMatches("FACEBOOK is buying GOOGLE and FACE BOOK FB").map(s => s.groupName -> s.result.group()).toList)
}
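
For readers working in plain Java, the same idea could be sketched roughly as follows. This is only an illustration under some assumptions: the class name `NamedGroupMatches` and the helpers `groupNames` and `describeMatches` are made up for this example, and the name scan is a heuristic over the pattern source that does not handle escaped parentheses or `\Q...\E` quoting. It relies on `Matcher.start(String)`, which returns -1 when the named group did not take part in the current match.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupMatches {

    // Heuristic: finds "(?<name>" openers in the regex source (hypothetical helper pattern).
    private static final Pattern GROUP_NAME =
            Pattern.compile("\\(\\?<([a-zA-Z][a-zA-Z0-9]*)>");

    // Collects the declared group names in the order they appear in the regex.
    static Set<String> groupNames(String regex) {
        Set<String> names = new LinkedHashSet<>();
        Matcher m = GROUP_NAME.matcher(regex);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    // For every match, reports each named group that participated in it.
    static List<String> describeMatches(String regex, String text) {
        Set<String> names = groupNames(regex);
        Matcher matcher = Pattern.compile(regex).matcher(text);
        List<String> lines = new ArrayList<>();
        while (matcher.find()) {
            for (String name : names) {
                // start(name) is -1 when this group did not take part in the match.
                if (matcher.start(name) != -1) {
                    lines.add(String.format(
                            "Group=%s matches substring=\"%s\" at position=[%d, %d)",
                            name, matcher.group(name),
                            matcher.start(name), matcher.end(name)));
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String regex = "(?<FB>(FACE(\\p{Space}?)BOOK))|(?<GOOGL>(GOOGL(E)?))";
        String text = "FACEBOOK is buying GOOGLE and FACE BOOK";
        describeMatches(regex, text).forEach(System.out::println);
    }
}
```

With the pattern and text from the question, this should print the three lines the question asks for. If named groups are nested inside one another, a single match would be reported once per participating group. On Java 20 and later, `Pattern.namedGroups()` returns the name-to-index map directly, so the name-scanning step would likely be unnecessary there.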

You could use the named-regexp Java library. It is a thin wrapper around java.util.regex with named capture group support, primarily for pre-Java-7 users, but it also contains methods to inspect the group names (a capability that appears to be missing even from Java 11).
