Extracting key-value pairs from multiline text in Java

Consider the following multiline string:

This is multiline text that needs to be correctly parsed into key-value pairs, excluding all other information.

 Section One:
    First key = Value One
    Second key = Value Two

 Section Two:   
    Third key = Value Three
    Fourth key = Value Four
    Fifth key = Value Five

 Section Three:
    Sixth key = Value Six
    Seventh key = Value Seven
    Eighth key = Value Eight

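In other words, the text consists of an "introduction" (some phrases), followed by multiple lines organized into sections. Each section has a "header" (e.g., Section One) and multiple key-value pairs separated by =.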

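Keys can contain any character except a newline and =, and values can contain any character except a newline.

Sometimes, other irrelevant lines may appear in the text.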

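What is needed is a regular expression for which repeated calls to Matcher.find() return every key-value pair as groups, and only those pairs, skipping the introduction, the section headers, and any other line that does not contain a key-value pair.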

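Ideally, no other pre-processing or post-processing of the text should be required.

Reading the text line by line and processing each line accordingly is not an option in this use case.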

Patterns such as (?:\\r|\\n)(\\s*[^=\\.]+)\\s*=\\s*(.+) come close, but they still capture more than required.

Any ideas?

You're almost there. Just change \\s* to <space>*, because \\s also matches newline characters.

(?:\r|\n) *([^\n=\.]+)(?<=\S) *= *(.+)
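To make the difference concrete, here is a small sketch that runs the question's pattern and the pattern above against the same short input and compares what group 1 captures. The class name, the `firstKey` helper, and the shortened input are made up for illustration. With the question's pattern, both `\s*` and `[^=\.]+` can consume line breaks, so the section header ends up inside the captured key.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WhitespaceClassDemo {
    // Pattern from the question: \s* and [^=\.]+ may both run across line breaks.
    static final Pattern QUESTION =
            Pattern.compile("(?:\\r|\\n)(\\s*[^=\\.]+)\\s*=\\s*(.+)");

    // Pattern from the answer: plain spaces, and a key class that excludes \n.
    static final Pattern ANSWER =
            Pattern.compile("(?:\\r|\\n) *([^\\n=\\.]+)(?<=\\S) *= *(.+)");

    // Hypothetical helper: returns group 1 of the first match, or null if there is none.
    static String firstKey(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "Intro.\n Section One:\n    First key = Value One";
        // Line breaks are shown as \n so that each captured key prints on one line.
        System.out.println("[" + firstKey(QUESTION, text).replace("\n", "\\n") + "]");
        System.out.println("[" + firstKey(ANSWER, text).replace("\n", "\\n") + "]");
    }
}
```

The first printed key spans two lines of the input, while the second is only the text to the left of the = sign.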

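If the text may contain tabs, change each <space>* above to [ \\t]*. (?<=\\S) is a positive lookbehind: it asserts that the character just before that position is not whitespace, so the captured key ends with a non-whitespace character and the blanks before = are left out of it.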
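The effect of the lookbehind can be checked in isolation. The sketch below compares the tab-aware pattern with and without `(?<=\S)` on a line where a space and a tab sit between the key and the = sign. The class name, the `firstKey` helper, and the sample input are assumptions made for this illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookbehindDemo {
    // Tab-aware pattern without the lookbehind: the greedy key class swallows trailing blanks.
    static final Pattern WITHOUT_LOOKBEHIND =
            Pattern.compile("(?:\\r|\\n)[ \\t]*([^\\n=\\.]+)[ \\t]*=[ \\t]*(.+)");

    // Same pattern with (?<=\S): the key is forced to end on a non-whitespace character.
    static final Pattern WITH_LOOKBEHIND =
            Pattern.compile("(?:\\r|\\n)[ \\t]*([^\\n=\\.]+)(?<=\\S)[ \\t]*=[ \\t]*(.+)");

    // Hypothetical helper: returns group 1 of the first match, or null if there is none.
    static String firstKey(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // A leading line break is needed because both patterns start with (?:\r|\n).
        String text = "\n\tFirst key \t= Value One";
        System.out.println("[" + firstKey(WITHOUT_LOOKBEHIND, text) + "]");
        System.out.println("[" + firstKey(WITH_LOOKBEHIND, text) + "]");
    }
}
```

Without the lookbehind the key keeps its trailing space and tab. With it, the regex engine backtracks until the key ends in `y`, and the blanks are matched by `[ \t]*` instead.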

DEMO

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeyValueDemo {
    public static void main(String[] args) {
        String s = "This is multiline text that needs to be correctly parsed into key-value pairs, excluding all other information.\n" +
                "\n" +
                " Section One:\n" +
                "    First key = Value One\n" +
                "    Second key = Value Two\n" +
                "\n" +
                " Section Two:   \n" +
                "    Third key = Value Three\n" +
                "    Fourth key = Value Four\n" +
                "    Fifth key = Value Five\n" +
                "\n" +
                " Section Three:\n" +
                "    Sixth key = Value Six\n" +
                "    Seventh key = Value Seven\n" +
                "    Eighth key = Value Eight";
        // [\t ]* allows only horizontal whitespace, so a match never spans a line break;
        // (?<=\S) keeps the blanks before '=' out of the captured key.
        Matcher m = Pattern.compile("(?:\\r|\\n)[\\t ]*([^\\n=\\.]+)(?<=\\S)[\\t ]*=[\\t ]*(.+)").matcher(s);
        while (m.find()) {
            System.out.println("Key : " + m.group(1) + " => Value : " + m.group(2));
        }
    }
}

Output:

Key : First key => Value : Value One
Key : Second key => Value : Value Two
Key : Third key => Value : Value Three
Key : Fourth key => Value : Value Four
Key : Fifth key => Value : Value Five
Key : Sixth key => Value : Value Six
Key : Seventh key => Value : Value Seven
Key : Eighth key => Value : Value Eight
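If the pairs are needed as a data structure rather than printed, the same loop can fill a map. The following is a minimal sketch, not part of the original answer: the class name `KeyValueMapSketch` and the method `parse` are hypothetical, and a `LinkedHashMap` is assumed so that pairs keep the order in which they appear. Two properties of the pattern carry over: it requires a preceding line break, so a pair on the very first line of the text is not matched, and its key class also rejects `.`, so a key containing a dot would not be captured whole.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeyValueMapSketch {
    // Same pattern as in the demo: group 1 is the key, group 2 is the value.
    private static final Pattern PAIR =
            Pattern.compile("(?:\\r|\\n)[\\t ]*([^\\n=\\.]+)(?<=\\S)[\\t ]*=[\\t ]*(.+)");

    // Collects every matched pair in order of appearance.
    // If a key occurs more than once, the later value replaces the earlier one.
    static Map<String, String> parse(String text) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(text);
        while (m.find()) {
            pairs.put(m.group(1), m.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String text = "Intro text.\n\n Section One:\n    First key = Value One\n"
                + "    Second key = Value Two\n\n Section Two:   \n    Third key = Value Three";
        parse(text).forEach((key, value) -> System.out.println(key + " -> " + value));
    }
}
```

The whole text is still handled in a single pass over the matcher, with no line-by-line reading, which is in line with the constraints stated in the question.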

