Splitting by Java regular expression
I have a string like this:
Snt:It was the most widespread day of environmental action in the planet's history
====================
-----------
Snt:Five years ago, I was working for just over minimum wage
====================
-----------
I want to split it on
====================
-----------
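and, of course, remove the Snt: prefix from the start of each sentence.
What is the best way to do this?
I tried this regular expression, but it didn't work: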
String[] content1 =content.split("\\n\\====================\\n\\-----------\\n");
Thanks in advance.
How about
Pattern p = Pattern.compile("^Snt:(.*)$", Pattern.MULTILINE);
Matcher m = p.matcher(str);
while (m.find()) {
String sentence = m.group(1);
}
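instead of hacking around with split and doing extra parsing? This just looks for lines that start with "Snt:" and captures whatever follows on that line.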
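For reference, the line-matching idea above can be written out as a complete program. This is only a minimal sketch: the class name SntLineMatcher and the helper extractSentences are made up for illustration, and it assumes every sentence fits on a single line that begins with "Snt:".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, chosen only for this sketch.
class SntLineMatcher {
    // With MULTILINE, ^ and $ anchor at the start and end of each line,
    // so group 1 captures the rest of any line that begins with "Snt:".
    private static final Pattern SNT_LINE =
            Pattern.compile("^Snt:(.*)$", Pattern.MULTILINE);

    // Hypothetical helper: collects the text after "Snt:" from every matching line.
    static List<String> extractSentences(String text) {
        List<String> sentences = new ArrayList<>();
        Matcher m = SNT_LINE.matcher(text);
        while (m.find()) {
            sentences.add(m.group(1));
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "Snt:It was the most widespread day of environmental action in the planet's history\n"
                + "====================\n"
                + "-----------\n"
                + "Snt:Five years ago, I was working for just over minimum wage\n"
                + "====================\n"
                + "-----------";
        // Prints one sentence per line, without the "Snt:" prefix.
        for (String sentence : extractSentences(text)) {
            System.out.println(sentence);
        }
    }
}
```

The delimiter lines are simply ignored here because they never start with "Snt:", so the pattern does not need to describe them at all.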
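Because of the way your data is structured, I would flip the concept of the split around and use a matcher instead. This also lets you match on the Snt: prefix directly: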
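import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SntMatcher { // class name is arbitrary
    private static final String VAL = "Snt:It was the most widespread day of environmental action in the planet's history\n"
            + "====================\n"
            + "-----------\n"
            + "Snt:Five years ago, I was working for just over minimum wage\n"
            + "====================\n"
            + "-----------";

    public static void main(String[] args) {
        List<String> phrases = new ArrayList<String>();
        // Each match covers one whole record: the sentence plus its two delimiter lines.
        Matcher mat = Pattern.compile("Snt:(.+?)\n={20}\n-{11}\\s*").matcher(VAL);
        while (mat.find()) {
            phrases.add(mat.group(1)); // group 1 is the sentence without the "Snt:" prefix
        }
        System.out.printf("Value: %s%n", phrases);
    }
}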
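I use the regular expression "Snt:(.+?)\n={20}\n-{11}\\s*" (shown as it appears in the Java string literal).
It assumes that the first word in the file is Snt:, then captures the phrase that follows in a group, up to the delimiter. It also consumes any trailing whitespace, which leaves the matcher positioned at the start of the next record.
The benefit of this approach is that each match covers exactly one record, rather than having an expression that matches part of the end of one record and possibly the start of the next.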
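Since there is no newline at the end of the input, your pattern will not match the last pair of == and -- lines. You need to add the end-of-line anchor $ as an alternative to the final \\n in the regex: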
String s = "Snt:It was the most widespread day of environmental action in the planet's history\n" +
"====================\n" +
"-----------\n" +
"Snt:Five years ago, I was working for just over minimum wage\n" +
"====================\n" +
"-----------";
String m = s.replaceAll("(?m)^Snt:", "");
String[] tok = m.split("\\n\\====================\\n\\-----------(?:\\n|$)");
System.out.println(Arrays.toString(tok));
Output:
[It was the most widespread day of environmental action in the planet's history, Five years ago, I was working for just over minimum wage]
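To make the effect of the missing final newline visible, the two delimiter patterns can be compared side by side. The following is a rough sketch under a few assumptions: the class name SplitComparison, the helper splitRecords and the two constant names are invented for illustration, and the delimiters are written with {20} and {11} quantifiers instead of literal runs of characters.

```java
import java.util.Arrays;

// Hypothetical class name, chosen only for this sketch.
class SplitComparison {
    // Delimiter that insists on a newline after the dashes (as in the question).
    static final String DELIM_NEWLINE_ONLY = "\\n={20}\\n-{11}\\n";
    // Delimiter that accepts either a newline or the end of the input.
    static final String DELIM_NEWLINE_OR_END = "\\n={20}\\n-{11}(?:\\n|$)";

    // Hypothetical helper: strips the "Snt:" prefixes, then splits on the given delimiter.
    static String[] splitRecords(String text, String delimiterRegex) {
        return text.replaceAll("(?m)^Snt:", "").split(delimiterRegex);
    }

    public static void main(String[] args) {
        String s = "Snt:It was the most widespread day of environmental action in the planet's history\n"
                + "====================\n"
                + "-----------\n"
                + "Snt:Five years ago, I was working for just over minimum wage\n"
                + "====================\n"
                + "-----------";
        // The last element still carries its "====" and "----" lines,
        // because no newline follows the final run of dashes.
        System.out.println(Arrays.toString(splitRecords(s, DELIM_NEWLINE_ONLY)));
        // Both elements are clean sentences.
        System.out.println(Arrays.toString(splitRecords(s, DELIM_NEWLINE_OR_END)));
    }
}
```

Note that String.split drops trailing empty strings by default, which is why the second variant does not produce an extra empty element after the final delimiter.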
Matcher m = Pattern.compile("([^=\\-]+)([=\\-]+[\\t\\n\\s]*)+").matcher(str);
while (m.find()) {
String match = m.group(1);
System.out.println(match);
}
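With the snippet above, group(1) still contains the "Snt:" prefix and the newline that precedes the delimiter lines. A possible cleanup is sketched below; the class name DelimiterRunMatcher and the helper extractSentences are assumptions made for illustration, and [\t\n\s]* is shortened to \s*, which already covers tabs and newlines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, chosen only for this sketch.
class DelimiterRunMatcher {
    // Group 1: a run of characters that are neither '=' nor '-'.
    // Group 2 (repeated): a run of '=' or '-' followed by optional whitespace.
    private static final Pattern RECORD =
            Pattern.compile("([^=\\-]+)([=\\-]+\\s*)+");

    // Hypothetical helper: trims each captured run and drops a leading "Snt:".
    static List<String> extractSentences(String text) {
        List<String> sentences = new ArrayList<>();
        Matcher m = RECORD.matcher(text);
        while (m.find()) {
            String raw = m.group(1).trim();
            sentences.add(raw.startsWith("Snt:") ? raw.substring("Snt:".length()) : raw);
        }
        return sentences;
    }

    public static void main(String[] args) {
        String str = "Snt:It was the most widespread day of environmental action in the planet's history\n"
                + "====================\n"
                + "-----------\n"
                + "Snt:Five years ago, I was working for just over minimum wage\n"
                + "====================\n"
                + "-----------";
        for (String sentence : extractSentences(str)) {
            System.out.println(sentence);
        }
    }
}
```

One caveat with this pattern: because group 1 excludes every '=' and '-', a sentence that itself contains a hyphen or an equals sign (for example "well-known") would be cut at that character and reported as two pieces.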