
How to remove the Miscellaneous Symbols block from a Unicode string

I want to remove the Miscellaneous Symbols block from a Unicode string using a regular expression. I have tried several regular expressions, but none of them worked. Can anyone help me remove the Miscellaneous Symbols block from the string?

Unicode string:

\u263A\uD83D\uDE0A\uD83D\uDE22)\uD83C\uDF82

Code:

String input = "\u263A\uD83D\uDE0A\uD83D\uDE22)\uD83C\uDF82";
input.replaceAll("[\u2600-\u26FF]|[\u2700-\u27BF]", "");

Expected:

\uD83D\uDE0A\uD83D\uDE22)\uD83C\uDF82

But this does not work. How can I solve this issue?

It does not work because String is immutable in Java: replaceAll returns a new string and leaves input unchanged. You have to assign the result, like this:

String result = input.replaceAll("[\u2600-\u26FF]|[\u2700-\u27BF]", "");

Or simply:

input = input.replaceAll("[\u2600-\u26FF]|[\u2700-\u27BF]", "");

So if you print the result next to the expected string, like this:

System.out.println(input);
System.out.println("\uD83D\uDE0A\uD83D\uDE22)\uD83C\uDF82");

Both give:

😊😢)🎂
😊😢)🎂
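
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the fix. The class name SymbolStripper and the method strip are made up for illustration, and the two ranges are merged into a single character class, which should match the same characters as the alternation above.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question or answer.
final class SymbolStripper {
    // U+2600 to U+26FF is the Miscellaneous Symbols block,
    // U+2700 to U+27BF is the Dingbats block.
    private static final Pattern SYMBOLS =
            Pattern.compile("[\u2600-\u26FF\u2700-\u27BF]");

    // Returns a new string; the argument itself is never modified.
    static String strip(String text) {
        return SYMBOLS.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String input = "\u263A\uD83D\uDE0A\uD83D\uDE22)\uD83C\uDF82";

        // Calling strip without using the return value changes nothing.
        strip(input);
        System.out.println(input.startsWith("\u263A")); // prints true

        // Assigning the result is what actually removes the symbol.
        input = strip(input);
        System.out.println(input.equals("\uD83D\uDE0A\uD83D\uDE22)\uD83C\uDF82")); // prints true
    }
}
```

The emoji survive because Java regular expressions match by code point, so each surrogate pair is treated as one character outside both ranges.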

If the input text contains u-escaped characters, that is, literal text consisting of a backslash, 'u', and 4 hexadecimal digits, convert them to real chars first.

input = StringEscapeUtils.unescapeJava(input); // From Apache Commons
input = input.replaceAll("[\u2600-\u26FF]|[\u2700-\u27BF]", "");
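
If pulling in Apache Commons is not an option, the unescaping step can be approximated with plain JDK classes. This is only a rough sketch under simplifying assumptions: the class UnicodeUnescaper and its unescape method are hypothetical names, and it only handles the plain backslash-u-plus-four-hex-digits form, not the other escapes (such as newline or tab escapes) that unescapeJava also understands.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; a simplified stand-in for a library unescaper.
final class UnicodeUnescaper {
    // Matches a literal backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE =
            Pattern.compile("\\\\u([0-9A-Fa-f]{4})");

    static String unescape(String text) {
        Matcher m = ESCAPE.matcher(text);
        StringBuilder out = new StringBuilder(text.length());
        int last = 0;
        while (m.find()) {
            // Copy the text between the previous match and this one unchanged.
            out.append(text, last, m.start());
            // Turn the four hex digits into the char they encode.
            out.append((char) Integer.parseInt(m.group(1), 16));
            last = m.end();
        }
        out.append(text, last, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // Doubled backslashes: this string holds the escape sequences as text.
        String raw = "\\u263A\\uD83D\\uDE0A)";
        String decoded = unescape(raw);
        String cleaned = decoded.replaceAll("[\u2600-\u26FF\u2700-\u27BF]", "");
        System.out.println(cleaned.equals("\uD83D\uDE0A)")); // prints true
    }
}
```

Surrogate pairs work here because each half is decoded to its own char and the two adjacent chars then form the supplementary code point.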
