
How to replace multiple consecutive occurrences of a character with a maximum allowed number of occurrences?

import java.util.regex.Matcher;
import java.util.regex.Pattern;

CharSequence content = new StringBuffer("aaabbbccaaa");
// A letter followed by at least two more copies of itself (a run of 3 or more)
String pattern = "([a-zA-Z])\\1\\1+";
String replace = "-";

Pattern patt = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE);
Matcher matcher = patt.matcher(content);

StringBuffer buffer = new StringBuffer();

// Each find() advances to the next run, so a single loop is enough;
// an extra find() before the loop would skip the first match
while (matcher.find()) {
    matcher.appendReplacement(buffer, replace);
}
matcher.appendTail(buffer);
System.out.println(buffer.toString());

In the above code, content is the input string.

I am trying to find runs of a repeated character in a string and reduce each run to a maximum allowed number of occurrences.

For example:

input - ("abaaadccc", 2)
output - "abaadcc"
Here aaa and ccc are replaced by aa and cc, because the maximum allowed repetition is 2.

In the above code I find such occurrences and replace them with -, and that works. But how can I get the matched character and replace the run with the allowed number of occurrences?

I.e. if aaa is found, it is replaced by aa.

Or is there an alternative method without using regex?

You can wrap the allowed repetition in an outer capturing group, nest a second group for the character itself, and use the outer group as the replacement:

String result = "aaabbbccaaa".replaceAll("(([a-zA-Z])\\2)\\2+", "$1");

Here's how it works:

(                        first group - the character repeated two times
    ([a-zA-Z])           second group - a single letter
    \2                   the same letter once more
)
\2+                      the same letter at least once more

Thus, the first group captures a replacement string.
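
To see what each group holds, here is a small sketch that prints the groups for every match. The class name GroupDemo and the helper describe are only illustrative and are not part of the original answer:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Builds one line per match: the whole match, then groups 1 and 2
    static String describe(String input) {
        Matcher m = Pattern.compile("(([a-zA-Z])\\2)\\2+").matcher(input);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            sb.append(m.group())
              .append(" -> $1=").append(m.group(1))
              .append(", $2=").append(m.group(2))
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "cc" is not reported: the pattern needs a run of at least three
        System.out.print(describe("aaabbbccaaa"));
    }
}
```

For each run of three or more letters, group 1 should hold the two characters that are kept and group 2 the single letter being repeated.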

This solution extends easily to a different maximum number of allowed repeats:

String input = "aaaaabbcccccaaa";
int maxRepeats = 4;
// Group 1 = the letter plus (maxRepeats - 1) further copies, i.e. the part to keep
String pattern = String.format("(([a-zA-Z])\\2{%d})\\2+", maxRepeats - 1);
String result = input.replaceAll(pattern, "$1");
System.out.println(result); // aaaabbccccaaa
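
If this is needed in more than one place, the same idea could be wrapped in a helper. This is only a sketch: the class name RepeatLimiter, the method name limitRepeats, and the argument check are assumptions added for illustration, not part of the original answer:

```java
import java.util.regex.Pattern;

class RepeatLimiter {
    // Trims every run of the same ASCII letter to at most maxRepeats characters
    static String limitRepeats(String input, int maxRepeats) {
        if (maxRepeats < 1) {
            throw new IllegalArgumentException("maxRepeats must be at least 1");
        }
        // Group 1 is the part to keep: the letter plus (maxRepeats - 1) copies
        String regex = "(([a-zA-Z])\\2{" + (maxRepeats - 1) + "})\\2+";
        return Pattern.compile(regex).matcher(input).replaceAll("$1");
    }

    public static void main(String[] args) {
        // The example from the question
        System.out.println(limitRepeats("abaaadccc", 2)); // abaadcc
    }
}
```

Without the CASE_INSENSITIVE flag this treats "a" and "A" as different characters, which is also how the replaceAll version above behaves.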

Since you defined a group in your regex, you can get the text matched by that group by calling matcher.group(1). In your case it contains the repeated character, so appending it twice gives your expected result.

    CharSequence content = new StringBuffer("aaabbbccaaa");
    String pattern = "([a-zA-Z])\\1\\1+";

    Pattern patt = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE);
    Matcher matcher = patt.matcher(content);

    StringBuffer buffer = new StringBuffer();

    while (matcher.find()) {
        System.out.println("found : "+matcher.start()+","+matcher.end()+":"+matcher.group(1));
        matcher.appendReplacement(buffer, matcher.group(1)+matcher.group(1));
    }
    matcher.appendTail(buffer);
    System.out.println(buffer.toString());

Output:

found : 0,3:a
found : 3,6:b
found : 8,11:a
aabbccaa
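
As for the alternative without regex that the question asks about, a single pass that counts the length of the current run could look roughly like this. It is a sketch under a few assumptions: the names RunLimiter and limitRepeats are made up, the comparison is case-sensitive, and runs of any character are trimmed, not only the letters matched by [a-zA-Z]:

```java
class RunLimiter {
    // Copies each character unless it would extend the current run past maxRepeats
    static String limitRepeats(String input, int maxRepeats) {
        StringBuilder out = new StringBuilder(input.length());
        int run = 0; // length of the run that ends at the current character
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            // Same as the previous character: the run grows; otherwise a new run starts
            run = (i > 0 && c == input.charAt(i - 1)) ? run + 1 : 1;
            if (run <= maxRepeats) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(limitRepeats("abaaadccc", 2)); // abaadcc
    }
}
```

This avoids building a pattern for every maxRepeats value and runs in one pass over the input.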

