I can't find the correct way to remove every substring that equals "null" (case-insensitively), replacing it with an empty string, in a huge input string that contains many lines and uses ; as a separator.
To simplify here is an example of what I am looking for:
Input string
Steve;nuLL;2;null\n
null;nullo;nUll;Marc\n
....
Expected Output
Steve;;2;\n
;nullo;;Marc\n
...
Code
Matcher matcher = Pattern.compile("(?i)(^|;)(null)(;|$)").matcher(dataStr);
StringBuffer sb = new StringBuffer();
while (matcher.find()) {
matcher.appendReplacement(sb, matcher.group(1) + "" + matcher.group(3));
}
return sb.toString();
Can this be solved by using regex?
EDIT:
With the Java code above, only the first match is ever replaced, not every occurrence in the line and in the data stream. For whatever reason, matcher.find() is only executed once.
return dataStr.replaceAll("(?smi)\\bnull\\b", "");
\\b is the word boundary. (?i) is an embedded flag, with i meaning ignore case. (?s) is DOTALL (. matches newline characters too). (?m) is MULTILINE.
You forgot appendTail, which appends everything after the last replacement. If the string contains more than one line, add the MULTILINE option so that ^ and $ match at line boundaries. See the javadoc of Pattern.
while (matcher.find()) {
matcher.appendReplacement(sb, matcher.group(1) + "" + matcher.group(3));
}
matcher.appendTail(sb);
Alternatively, with a lambda (this replaceAll overload requires Java 9 or later):
String result = matcher.replaceAll(mr -> mr.group(1) + mr.group(3));
where mr is a freely named MatchResult provided by replaceAll.
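Putting the pieces of this answer together, a minimal self-contained sketch could look like the following. The class name NullFieldCleaner and the method name stripNullFields are made up for illustration; the pattern is the one from the question with the (?m) flag added and the group around null dropped, so the trailing separator becomes group 2.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NullFieldCleaner {
    // Question's pattern plus (?m), so ^ and $ also match at line boundaries.
    private static final Pattern NULL_FIELD = Pattern.compile("(?im)(^|;)null(;|$)");

    // Hypothetical helper: removes fields that consist only of "null".
    static String stripNullFields(String dataStr) {
        Matcher matcher = NULL_FIELD.matcher(dataStr);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            // Keep the separators (groups 1 and 2), drop the "null" between them.
            matcher.appendReplacement(sb, matcher.group(1) + matcher.group(2));
        }
        // Without this call, everything after the last match is lost.
        matcher.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "Steve;nuLL;2;null\nnull;nullo;nUll;Marc\n";
        System.out.print(stripNullFields(input));
    }
}
```

One caveat with this pattern: each match consumes its trailing separator, so when two null fields are directly adjacent (as in a;null;null;b) the second one is not matched. Lookarounds instead of capturing groups would avoid that.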
You can replace null with nothing, as long as it is followed by certain characters, for example:
first.replaceAll("(?i)(null)(?=[;$\\\n])", "")
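A small sketch may help show what this lookahead does and does not cover. Inside a character class, $ is a literal dollar sign rather than an end-of-input anchor, and nothing constrains what comes before null. The class LookaheadDemo and both method names are hypothetical; the second pattern, which anchors on both sides with lookarounds, is an assumed variation and not part of the answer above.

```java
class LookaheadDemo {
    // The pattern exactly as given in the answer: only checks what follows "null".
    static String followedBySeparator(String s) {
        return s.replaceAll("(?i)(null)(?=[;$\\\n])", "");
    }

    // Assumed variation: "null" must fill the whole field, i.e. sit between
    // a line start or ';' and a ';' or line end.
    static String wholeFieldOnly(String s) {
        return s.replaceAll("(?im)(?<=^|;)null(?=;|$)", "");
    }

    public static void main(String[] args) {
        String[] samples = { "xnull;y", "a;null", "null;null;null" };
        for (String sample : samples) {
            System.out.println(sample
                    + " -> [" + followedBySeparator(sample) + "]"
                    + " vs [" + wholeFieldOnly(sample) + "]");
        }
    }
}
```

With the answer's pattern, xnull;y loses its null suffix and a trailing null at the very end of the input (with no newline after it) is left in place; the lookaround variation treats both cases the other way round.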
You don't need anything fancy:
str = str.replaceAll("(?i)\\bnull\\b", "");
(?i) means "ignore case". \\b means "word boundary". Embedded newlines are irrelevant.
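As a quick check, the one-liner can be run against the sample data from the question. The sketch below is only illustrative, and the class and method names (WordBoundaryDemo, removeNullWords) are invented. It also shows a trade-off worth being aware of: a word boundary is not the same thing as a field boundary, so a null that is one word inside a longer field is removed as well.

```java
class WordBoundaryDemo {
    // Hypothetical wrapper around the one-liner from the answer.
    static String removeNullWords(String str) {
        return str.replaceAll("(?i)\\bnull\\b", "");
    }

    public static void main(String[] args) {
        // Sample data from the question: "nullo" is kept, every bare "null" goes.
        System.out.print(removeNullWords("Steve;nuLL;2;null\nnull;nullo;nUll;Marc\n"));

        // A field that merely contains the word "null" is also affected.
        System.out.println(removeNullWords("not null;x"));
    }
}
```

If fields such as "not null" must survive untouched, a pattern anchored on the ; separator and the line boundaries would be the stricter choice.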