
How to match nested repeating groups with regex in Java?

I am trying to match repeating groups in Java:

String s = "The very first line\n"
        + "\n"
        + "AA (aa)\n"
        + "BB (bb)\n"
        + "CC (cc)\n"
        + "\n";

Pattern p = Pattern.compile(
        "The very first line\\s+"
        + "((?<gr1>[a-z]+)\\s+\\((?<gr2>[^)]+)\\)\\s*)+",
        Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

Matcher m = p.matcher(s);

if (m.find()) {
    for (int i = 0; i <= m.groupCount(); i++) {
        System.out.println("group #" + i + ": [" + m.group(i).trim() + "]");
    }
    System.out.println("group gr1: [" + m.group("gr1").trim() + "]");
    System.out.println("group gr2: [" + m.group("gr2").trim() + "]");
}

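The problem is with the repeated groups: although the regex matches the whole text block (see group #0 in the output below), retrieving groups #2 and #3 (or gr1 / gr2 by name) returns only the last match (CC/cc) and skips the earlier ones (AA/aa and BB/bb):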

group #0: [The very first line

AA (aa)
BB (bb)
CC (cc)]
group #1: [CC (cc)]
group #2: [CC]
group #3: [cc]
group gr1: [CC]
group gr2: [cc]

Is there a way to solve this?

EDIT: "The very first line" is in the pattern as an identifying string; see the comment on gknicker's answer below.

It seems you want your pattern to match only the individual repeating sections rather than the whole input string. If so, your pattern would be:

    Pattern p = Pattern.compile(
        "((?<gr1>[a-z]+)\\s+\\((?<gr2>[^)]+)\\))",
        Pattern.CASE_INSENSITIVE);

In that case, you would use a while loop to find each match:

    Matcher m = p.matcher(s);

    while (m.find()) {
        System.out.println("group gr1: ["
            + m.group("gr1").trim() + "]");
        System.out.println("group gr2: ["
            + m.group("gr2").trim() + "]");
    }
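
Put together as a runnable program, the loop above might look like the following minimal sketch. The class name RepeatedSections and the helper findSections are hypothetical names chosen for illustration; the pattern is the one from this answer with the redundant outer group dropped. Note that it collects every "word (text)" pair anywhere in the input and does not check for the header line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedSections {
    // One repeating section: a word, whitespace, then text in parentheses.
    private static final Pattern SECTION = Pattern.compile(
        "(?<gr1>[a-z]+)\\s+\\((?<gr2>[^)]+)\\)",
        Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns one {gr1, gr2} pair per section found.
    static List<String[]> findSections(String input) {
        List<String[]> result = new ArrayList<>();
        Matcher m = SECTION.matcher(input);
        // Each call to find() resumes where the previous match ended,
        // so every section is visited once.
        while (m.find()) {
            result.add(new String[] { m.group("gr1"), m.group("gr2") });
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "The very first line\n\nAA (aa)\nBB (bb)\nCC (cc)\n\n";
        for (String[] pair : findSections(s)) {
            System.out.println("gr1: [" + pair[0] + "], gr2: [" + pair[1] + "]");
        }
    }
}
```

Collecting the pairs into a list rather than printing them directly keeps all three matches available after the loop, which is what the single repeated group in the question could not provide.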

However, if you need the whole match, you will probably have to use two patterns, like this:

    String s = "The very first line\n"
        + "\n"
        + "AA (aa)\n"
        + "BB (bb)\n"
        + "CC (cc)\n"
        + "\n";

    Pattern p = Pattern.compile(
        "The very first line\\s+(([a-z]+)\\s+\\(([^)]+)\\)\\s*)+",
        Pattern.CASE_INSENSITIVE);

    Pattern p2 = Pattern.compile(
        "((?<gr1>[a-z]+)\\s+\\((?<gr2>[^)]+)\\))",
        Pattern.CASE_INSENSITIVE);

    Matcher m = p.matcher(s);
    while (m.find()) {
        Matcher m2 = p2.matcher(m.group());
        while (m2.find()) {
            System.out.println("group gr1: ["
                + m2.group("gr1").trim() + "]");
            System.out.println("group gr2: ["
                + m2.group("gr2").trim() + "]");
        }
    }
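
As an alternative that is not part of the answer above, a single pattern can sometimes do the same job by using the \G anchor, which in java.util.regex matches at the end of the previous match. The sketch below assumes the sections follow the header line contiguously; the class name HeaderAnchoredSections and the method parse are hypothetical. The (?!\A) lookahead is there because \G also matches at the start of the input on the first attempt, which would otherwise accept sections that have no header in front of them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HeaderAnchoredSections {
    // A section must start either right after the header line or
    // exactly where the previous section ended (\G), never at input start.
    private static final Pattern SECTION = Pattern.compile(
        "(?:The very first line\\s+|(?!\\A)\\G)"
        + "(?<gr1>[a-z]+)\\s+\\((?<gr2>[^)]+)\\)\\s*",
        Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns "gr1=gr2" for each contiguous section.
    static List<String> parse(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = SECTION.matcher(input);
        while (m.find()) {
            result.add(m.group("gr1") + "=" + m.group("gr2"));
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "The very first line\n\nAA (aa)\nBB (bb)\nCC (cc)\n\n";
        System.out.println(parse(s));
    }
}
```

The trade-off is readability: the two-pattern version states the intent more plainly, while this one avoids the nested loop and stops at the first line that is not a section.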

