How to match nested repeating groups with regex in Java?
I am trying to match repeating groups with Java:
String s = "The very first line\n"
        + "\n"
        + "AA (aa)\n"
        + "BB (bb)\n"
        + "CC (cc)\n"
        + "\n";
Pattern p = Pattern.compile(
        "The very first line\\s+"
        + "((?<gr1>[a-z]+)\\s+\\((?<gr2>[^)]+)\\)\\s*)+",
        Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(s);
if (m.find()) {
    for (int i = 0; i <= m.groupCount(); i++) {
        System.out.println("group #" + i + ": [" + m.group(i).trim() + "]");
    }
    System.out.println("group gr1: [" + m.group("gr1").trim() + "]");
    System.out.println("group gr2: [" + m.group("gr2").trim() + "]");
}
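The problem is with the repeating groups: although the regex matches the whole text block (see group #0 in the sample output below), retrieving groups #2 and #3 (or gr1 / gr2 by name) returns only the last iteration (CC/cc) and skips the earlier ones (AA/aa and BB/bb):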
group #0: [The very first line
AA (aa)
BB (bb)
CC (cc)]
group #1: [CC (cc)]
group #2: [CC]
group #3: [cc]
group gr1: [CC]
group gr2: [cc]
Is there a way to solve this?
Edit: "The very first line" serves as an identifying string in the pattern; see the comments on gknicker's answer below.
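It looks like you want your pattern to match not the whole input string but only the individual repeating sections. If so, your pattern would be: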
Pattern p = Pattern.compile(
        "((?<gr1>[a-z]+)\\s+\\((?<gr2>[^)]+)\\))",
        Pattern.CASE_INSENSITIVE);
In that case, you would use a while loop to find each match:
Matcher m = p.matcher(s);
while (m.find()) {
    System.out.println("group gr1: ["
            + m.group("gr1").trim() + "]");
    System.out.println("group gr2: ["
            + m.group("gr2").trim() + "]");
}
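For reference, here is a self-contained sketch of the loop above. The class name RepeatedSections, the parse helper, and the choice of a map are assumptions made for illustration and are not part of the original answer. The loop is needed because a quantified capturing group such as (...)+ keeps only the text of its last iteration, so each section has to be picked up by a separate find() call.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedSections {
    // One "NAME (value)" section: gr1 is the name, gr2 the parenthesized value.
    private static final Pattern SECTION = Pattern.compile(
            "(?<gr1>[a-z]+)\\s+\\((?<gr2>[^)]+)\\)",
            Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: collects every section in order of appearance.
    // A repeated name would overwrite the earlier entry in this sketch.
    static Map<String, String> parse(String input) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = SECTION.matcher(input);
        while (m.find()) {
            result.put(m.group("gr1"), m.group("gr2"));
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "The very first line\n\nAA (aa)\nBB (bb)\nCC (cc)\n\n";
        parse(s).forEach((name, value) -> System.out.println(name + " -> " + value));
    }
}
```

Note that this pattern finds sections anywhere in the input; it does not check that they follow "The very first line".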
However, if you need to match the whole block first, you may have to use two patterns, as follows:
String s = "The very first line\n"
        + "\n"
        + "AA (aa)\n"
        + "BB (bb)\n"
        + "CC (cc)\n"
        + "\n";
// Outer pattern: the identifying line followed by one or more sections.
Pattern p = Pattern.compile(
        "The very first line\\s+(([a-z]+)\\s+\\(([^)]+)\\)\\s*)+",
        Pattern.CASE_INSENSITIVE);
// Inner pattern: a single section, applied to the text the outer pattern matched.
Pattern p2 = Pattern.compile(
        "((?<gr1>[a-z]+)\\s+\\((?<gr2>[^)]+)\\))",
        Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(s);
while (m.find()) {
    Matcher m2 = p2.matcher(m.group());
    while (m2.find()) {
        System.out.println("group gr1: ["
                + m2.group("gr1").trim() + "]");
        System.out.println("group gr2: ["
                + m2.group("gr2").trim() + "]");
    }
}
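As an alternative to two patterns, a single pattern using the \G boundary matcher (the end of the previous match) may also work. This is a sketch under assumptions, not part of the original answer: the class name AnchoredSections and the parse helper are hypothetical. The first match must start at the identifying line; each later match must begin exactly where the previous one ended. The (?!\A) lookahead is there because \G also matches at the start of the input on the first find() call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchoredSections {
    // Either the identifying line, or (not at input start) the end of the
    // previous match, followed by one "NAME (value)" section.
    private static final Pattern P = Pattern.compile(
            "(?:The very first line\\s+|(?!\\A)\\G)"
            + "(?<gr1>[a-z]+)\\s+\\((?<gr2>[^)]+)\\)\\s*",
            Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns each section as "name/value".
    static List<String> parse(String input) {
        List<String> pairs = new ArrayList<>();
        Matcher m = P.matcher(input);
        while (m.find()) {
            pairs.add(m.group("gr1") + "/" + m.group("gr2"));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String s = "The very first line\n\nAA (aa)\nBB (bb)\nCC (cc)\n\n";
        System.out.println(parse(s));
    }
}
```

With this approach, sections that are not preceded by the identifying line are not matched, and the chain stops at the first text that is not a section.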