How to clean a string from non-alphanumeric characters, but keep certain ones?
I have a string that contains non-alphanumeric characters as well as English and non-English letters. I need to strip the non-alphanumeric characters from it, but I want to keep some of them.
For instance, let's say I want to keep only the comma and the colon.
Example:
String st = "I, Love: ( Coding {}+-), codificación"
I want the output to be
"I,Love:Coding,codificación"
Is there a regex that can do that?
Note that the method below strips all non-alphanumeric characters from the text:
public static String cleanText(String text) {
    return text.replaceAll("\\P{LD}+", "");
}
You can use
public static String cleanText(String text) {
    return text.replaceAll("[^\\p{L}\\p{N}:,]+", "");
    // or return text.replaceAll("[^\\p{LD}:,]+", "");
}
Details:
[^ - start of a negated character class
\p{L} - any Unicode letter
\p{N} - any Unicode number (digits and other numeric characters)
: - a colon
, - a comma
]+ - end of the character class, repeated one or more times.
See a Java demo:
import java.util.*;
import java.io.*;

class Test
{
    public static void main (String[] args) throws java.lang.Exception
    {
        String st = "I, Love: ( Coding {}+-), codificación";
        System.out.println(cleanText(st));
    }

    public static String cleanText(String text) {
        return text.replaceAll("[^\\p{L}\\p{N}:,]+", "");
    }
}
// => I,Love:Coding,codificación
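If the set of characters to keep changes from call to call, the same negated character class can be built at run time. The following is only a sketch: the class name KeepChars and the two-argument cleanText overload are hypothetical names introduced for illustration, and it assumes the characters to keep all lie in the Basic Multilingual Plane. Each kept non-alphanumeric character is escaped with a backslash, which java.util.regex accepts before any non-alphabetic character, so symbols such as - or ] do not change the meaning of the class.

```java
class KeepChars {
    // Hypothetical helper: removes everything except letters, numbers,
    // and the characters listed in `keep` (assumed to be BMP characters).
    static String cleanText(String text, String keep) {
        StringBuilder cls = new StringBuilder("[^\\p{L}\\p{N}");
        for (char c : keep.toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                continue; // already covered by \p{L} / \p{N}
            }
            // A backslash before a non-alphabetic character is a legal escape,
            // so characters like '-' or ']' are treated literally.
            cls.append('\\').append(c);
        }
        cls.append("]+");
        return text.replaceAll(cls.toString(), "");
    }

    public static void main(String[] args) {
        String st = "I, Love: ( Coding {}+-), codificación";
        System.out.println(cleanText(st, ":,")); // I,Love:Coding,codificación
        System.out.println(cleanText(st, "+-")); // ILoveCoding+-codificación
    }
}
```

For a fixed set of characters, the literal pattern shown in the answer is simpler; a helper like this is only worth it when the kept characters come from configuration or user input.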