How to remove specific repeated characters from text?

I have a String like

"this is line 1\n\n\nthis is line 2\n\n\nthis is line 3\t\t\tthis is line 3 also"

What I want to do is collapse repeated occurrences of specific characters such as "\n" and "\t" in this text, so that it becomes:

"this is line 1\nthis is line 2\nthis is line 3\tthis is line 3 also"

I tried some regular expressions, but they didn't work for me.

text = text.replace("/[^\\w\\s]|(.)\\1/gi", ""); 

Is there any regex for this?

If you only need to remove specific whitespace characters, \s won't help because it overmatches, i.e. it will also match spaces and other whitespace characters.

You can put the characters in a character class, wrap it in a capturing group, and use a backreference to match repetitions of the captured value. Then replace the whole match with the Group 1 value:

.replaceAll("([\n\t])\\1+", "$1")

See the regex demo.

Details

  • ([\n\t]) - Group 1 (referred to with \\1 in the pattern, as written in a Java string literal, and with $1 in the replacement pattern): a character class matching either a newline or a tab character
  • \\1+ - one or more repetitions of the value captured in Group 1.
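
As a minimal sketch of the above applied to the string from the question (the class name CollapseRepeats and the helper collapse are hypothetical, chosen only for illustration):

```java
public class CollapseRepeats {
    // Hypothetical helper: collapses a run of identical newlines or tabs into one.
    public static String collapse(String text) {
        // ([\n\t]) captures a single newline or tab;
        // \\1+ then matches one or more further copies of that same character.
        return text.replaceAll("([\n\t])\\1+", "$1");
    }

    public static void main(String[] args) {
        String input = "this is line 1\n\n\nthis is line 2\n\n\nthis is line 3\t\t\tthis is line 3 also";
        String expected = "this is line 1\nthis is line 2\nthis is line 3\tthis is line 3 also";
        System.out.println(collapse(input).equals(expected)); // prints "true"
    }
}
```

Because the backreference repeats only the captured character, a mixed sequence such as "\n\t" is left alone, and repeated spaces are not touched. Also note that in Java, String.replace treats its first argument as literal text, so a regex has to go through replaceAll, and JavaScript-style /.../gi delimiters are not part of Java regex syntax.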

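I would use Guava's CharMatcher. Note that this removes every ISO control character (including all newlines and tabs) rather than collapsing repeats: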

CharMatcher.javaIsoControl().removeFrom(myString)
