How to remove part of string that includes special characters by RegEx or replaceAll?
Here are the strings:
1. "AAA BBB CCCCC CCCCCCC"
2. " AAA BBB DDDD DDDD DDDDD"
3. " EEE FFF GGGGG GGGGG"
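The amount of whitespace at the start and between the first and second words can vary. So I need a RegEx that removes everything before the third word, so that it always returns "CCCCC CCCCCCC", "DDDD DDDD DDDDD", or "GGGGG GGGGG". I assume this can be done with a RegEx rather than by parsing all the words in the string.
You need to use a capturing group to extract the data you want: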
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

String result = null;
try {
    // Skip the first two words, then capture the rest of the text in group 1.
    // [\\w ] is a character class: any word character or a space.
    Pattern regex = Pattern.compile("\\s*\\w+\\s*\\w+\\s*([\\w ]+)");
    Matcher regexMatcher = regex.matcher(" AAA BBB DDDD DDDD DDDDD");
    if (regexMatcher.find()) {
        result = regexMatcher.group(1); // result = "DDDD DDDD DDDDD"
    }
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}
Explanation of the regular expression:
"\\s" + // Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
"*" + // Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
"\\w" + // Match a single character that is a “word character” (letters, digits, and underscores)
"+" + // Between one and unlimited times, as many times as possible, giving back as needed (greedy)
"\\s" + // Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
"*" + // Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
"\\w" + // Match a single character that is a “word character” (letters, digits, and underscores)
"+" + // Between one and unlimited times, as many times as possible, giving back as needed (greedy)
"\\s" + // Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
"*" + // Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
"(" + // Match the regular expression below and capture its match into backreference number 1
"[\\w ]" + // Match a single character present in the list below
// A word character (letters, digits, and underscores)
// The space character “ ”
"+" + // Between one and unlimited times, as many times as possible, giving back as needed (greedy)
")"
This regex will work:
\s*\w+\s+\w+\s+(.+$)
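Java code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// (?m) turns on multiline mode, so $ matches at the end of every line
String pattern = "(?m)\\s*\\w+\\s+\\w+\\s+(.+$)";
String line = "AAA BBB CCCCC CCCCCCC\n AAA BBB DDDD DDDD DDDDD\n EEE FFF GGGGG GGGGG";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(line);
while (m.find()) {
    // Group 1 holds everything from the third word to the end of the line
    System.out.println("Found value: " + m.group(1));
}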
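Since the question also mentions replaceAll, here is a minimal sketch of the same idea without a capturing group: anchor the prefix at the start of the string and delete it with String.replaceFirst. The class name StripPrefix and the method stripFirstTwoWords are made up for illustration, and the sketch assumes each input is a single line with at least three words.

```java
class StripPrefix {
    // Deletes optional leading whitespace, the first two words,
    // and the whitespace that follows each of them.
    // If the string has fewer than three words, nothing matches
    // and the input is returned unchanged.
    static String stripFirstTwoWords(String s) {
        return s.replaceFirst("^\\s*\\w+\\s+\\w+\\s+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripFirstTwoWords("AAA BBB CCCCC CCCCCCC"));
        System.out.println(stripFirstTwoWords(" AAA BBB DDDD DDDD DDDDD"));
        System.out.println(stripFirstTwoWords(" EEE FFF GGGGG GGGGG"));
    }
}
```

replaceFirst is used rather than replaceAll because the prefix can occur only once per string when the pattern is anchored with ^.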
Similar to @rock321987's answer, you can modify the regex to use a quantifier that skips however many leading words you don't want.
\s*(?:\w+\s+){2}(.+$)
Or in Java:
"\\s*(?:\\w+\\s+){2}(.+$)"
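?: makes the pattern inside () a non-capturing group. The number in {} is the number of leading words, each followed by whitespace, that you want to skip.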
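To show how the count in {} can be made a parameter, here is a small sketch that builds the pattern from an integer. The class SkipWords and the method skipLeadingWords are hypothetical names, not part of any answer above, and the sketch assumes single-line input.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SkipWords {
    // Returns the text that follows the first n words, or an empty
    // Optional if the input does not contain more than n words.
    static Optional<String> skipLeadingWords(String input, int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must not be negative");
        }
        // (?:\w+\s+){n} skips n words, each followed by whitespace;
        // group 1 captures the rest of the line.
        Pattern p = Pattern.compile("\\s*(?:\\w+\\s+){" + n + "}(.+$)");
        Matcher m = p.matcher(input);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(skipLeadingWords(" AAA BBB DDDD DDDD DDDDD", 2).orElse("(no match)"));
        System.out.println(skipLeadingWords("AAA BBB CCCCC CCCCCCC", 1).orElse("(no match)"));
    }
}
```

Compiling the pattern on every call keeps the sketch short; if the same n is used repeatedly, the compiled Pattern could be cached instead.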