Eliminating Unicode Characters and Escape Characters from String
I want to remove all Unicode characters and escape characters such as \n and \t. In short, I want just an alphanumeric string.
For example:
\nMy Actual String\n
I want to fetch just 'My Actual String'. Is there any way to do so, either by using a built-in string method or a regular expression?
Try this:
anyString = anyString.replaceAll("\\\\u[0-9a-fA-F]{4}|\\\\.", "");
to remove escaped characters, that is, escape sequences written out as text (a literal backslash followed by a character). If you also want to remove all other special characters, use this one:
anyString = anyString.replaceAll("\\\\u[0-9a-fA-F]{4}|\\\\.|[^a-zA-Z0-9\\s]", "");
(I guess you want to keep the whitespace; if not, remove \\s from the one above.)
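To make the difference between the two patterns concrete, here is a minimal sketch. The class name `EscapeStripper` and its method names are hypothetical, chosen only for illustration. It assumes the escape sequences appear in the input as literal text (a backslash followed by a character) and that Unicode escapes use four hexadecimal digits.

```java
final class EscapeStripper {
    // Literal backslash + 'u' + four hex digits, or a backslash followed by any one character.
    private static final String ESCAPES = "\\\\u[0-9a-fA-F]{4}|\\\\.";

    // Removes only the escape sequences that are written out as text.
    static String stripEscapes(String s) {
        return s.replaceAll(ESCAPES, "");
    }

    // Additionally removes everything that is not an ASCII letter, digit or whitespace.
    static String stripEscapesAndSpecials(String s) {
        return s.replaceAll(ESCAPES + "|[^a-zA-Z0-9\\s]", "");
    }

    public static void main(String[] args) {
        // The Java literal "\\n" is two characters: a backslash and the letter n.
        String input = "\\nMy Actual String\\u00e9!\\t";
        System.out.println(stripEscapes(input));            // My Actual String!
        System.out.println(stripEscapesAndSpecials(input)); // My Actual String
    }
}
```

Note that a real newline or tab character (as opposed to the two-character text form) contains no backslash, so neither pattern removes it: the first leaves it untouched and the second keeps it because it matches \\s.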
Try:
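import java.util.regex.Matcher;
import java.util.regex.Pattern;

String stg = "\u2029My Actual String\u2029 \nMy Actual String";
// Match runs of word characters that do not start at a literal unicode escape or at whitespace.
Pattern pat = Pattern.compile("(?!(\\\\(u|U)\\w{4}|\\s))(\\w)+");
Matcher mat = pat.matcher(stg);
StringBuilder out = new StringBuilder();
while (mat.find()) {
    // Join the matched words with single spaces, without a trailing space.
    if (out.length() > 0) out.append(' ');
    out.append(mat.group());
}
System.out.println(out);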
The regex matches everything except Unicode and escape characters; more precisely, it matches runs of word characters (letters, digits, and underscores) and skips whatever lies between them.
[Figure: diagram of the regex]
Output:
My Actual String My Actual String
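As an alternative to collecting matches in a loop, the same result can probably be reached with a single built-in `replaceAll` call. The following is only a sketch and not part of either answer: the class name `AlnumOnly` and the method `clean` are hypothetical, and it assumes that only ASCII letters and digits should be kept and that words should end up separated by single spaces.

```java
final class AlnumOnly {
    // Replaces every run of non-alphanumeric characters with one space, then trims the ends.
    static String clean(String s) {
        return s.replaceAll("[^A-Za-z0-9]+", " ").trim();
    }

    public static void main(String[] args) {
        String stg = "\u2029My Actual String\u2029 \nMy Actual String";
        System.out.println(clean(stg)); // My Actual String My Actual String
    }
}
```

This handles real control characters and Unicode separators. If the input instead contains escape sequences written out as text, those would need to be removed first, because otherwise the letter after the backslash survives.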