
How do I convert Unicode U+xxxx representation of an emoji into the emoji in Java?

I have a string with Unicode representation of emoji like this:

"Hello U+1F601"

I want to convert this into:

"Hello 😁"

I have tried parsing the hex digits with parseInt() and converting the result to char, but I keep getting a black-and-white glyph instead of the emoji.

Any pointers on how I can achieve the intended result in Java?

PS: The unescapeJava() method doesn't work here. I have tried it, along with answers from other similar threads.

U+1F601 describes a Unicode codepoint.

Often, when you want to convert a codepoint to a String in Java, what you describe will work.

However, it only works when the codepoint is in the Basic Multilingual Plane (BMP), which basically means it's smaller than U+10000 (i.e. at most 4 hex digits). The BMP includes most frequently used characters, but is notably not home to many newer emojis.
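As a rough illustration of why the cast to char goes wrong, the sketch below checks whether a codepoint fits in one char and shows what the cast actually keeps. The class and method names (BmpCheck, fitsInChar, naiveCast) are made up for this example. The cast keeps only the low 16 bits, so U+1F601 turns into U+F601, which lies in the Private Use Area; that is a likely explanation for the odd glyph, though the exact rendering depends on the font.

```java
final class BmpCheck {
    // True when the codepoint is in the BMP and therefore fits in one char.
    static boolean fitsInChar(int codePoint) {
        return Character.isBmpCodePoint(codePoint);
    }

    // The naive conversion: the cast silently keeps only the low 16 bits.
    static char naiveCast(int codePoint) {
        return (char) codePoint;
    }

    public static void main(String[] args) {
        int cp = Integer.parseInt("1F601", 16);
        System.out.println(fitsInChar(cp));                      // false
        System.out.println(Integer.toHexString(naiveCast(cp)));  // f601
        System.out.println(Character.charCount(cp));             // 2
    }
}
```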

If it's above that point, you need two char values (a surrogate pair) that combine into a single codepoint, which takes some math.
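For the curious, the math in question is the UTF-16 surrogate encoding. The sketch below spells it out by hand; SurrogateMath and toSurrogatePair are hypothetical names chosen for this example, and in real code the built-in Character.toChars does the same job.

```java
import java.util.Arrays;

final class SurrogateMath {
    // Splits a supplementary codepoint (U+10000..U+10FFFF) into its
    // high and low UTF-16 surrogates.
    static char[] toSurrogatePair(int codePoint) {
        if (!Character.isSupplementaryCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a supplementary codepoint");
        }
        int offset = codePoint - 0x10000;               // 20 significant bits
        char high = (char) (0xD800 + (offset >> 10));   // top 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        char[] pair = toSurrogatePair(0x1F601);
        System.out.println(Integer.toHexString(pair[0]));  // d83d
        System.out.println(Integer.toHexString(pair[1]));  // de01
        // The JDK's own conversion yields the same two chars.
        System.out.println(Arrays.equals(pair, Character.toChars(0x1F601)));  // true
    }
}
```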

Luckily, you don't have to do that math yourself: you can use the Character.toString(int) overload (available since Java 11) instead:

Character.toString(0x1F601);
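If you are stuck on a Java version older than 11, where Character.toString(int) is not available, two long-standing alternatives should give the same result. This is only a sketch; the class and method names (CodePointToString, viaToChars, viaBuilder) are invented for the example.

```java
final class CodePointToString {
    // Character.toChars returns one char for BMP codepoints and a
    // surrogate pair for supplementary ones.
    static String viaToChars(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // StringBuilder.appendCodePoint handles the surrogate pair internally.
    static String viaBuilder(int codePoint) {
        return new StringBuilder().appendCodePoint(codePoint).toString();
    }

    public static void main(String[] args) {
        String a = viaToChars(0x1F601);
        String b = viaBuilder(0x1F601);
        System.out.println(a.equals(b));  // true
        System.out.println(a.length());   // 2 (two chars, one codepoint)
    }
}
```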

And to fully implement the replacement, we can simply use Matcher.replaceAll with a replacer function (available since Java 9):

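import java.util.regex.Pattern;

String input = "Hello U+1F601";
// Match "U+" followed by 4 to 6 hex digits; group 1 captures the digits.
Pattern p = Pattern.compile("U\\+([0-9a-fA-F]{4,6})");
// Parse the digits as a hex codepoint and replace the match with that character.
String result = p.matcher(input).replaceAll(r -> Character.toString(Integer.parseInt(r.group(1), 16)));
System.out.println(result);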
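A slightly more defensive variant is sketched below, under a few assumptions of my own. It avoids the Java 9 and Java 11 APIs, leaves a match untouched when the digits are not a valid codepoint (six hex digits can exceed U+10FFFF), and quotes the replacement because Matcher treats dollar signs and backslashes in a replacement string specially. UPlusDecoder and decode are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UPlusDecoder {
    private static final Pattern U_PLUS = Pattern.compile("U\\+([0-9a-fA-F]{4,6})");

    static String decode(String input) {
        Matcher m = U_PLUS.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int cp = Integer.parseInt(m.group(1), 16);
            // Keep the original text if the number is outside the Unicode range.
            String replacement = Character.isValidCodePoint(cp)
                    ? new String(Character.toChars(cp))
                    : m.group();
            // quoteReplacement makes "$" and "\" literal (e.g. for U+0024).
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("Hello U+1F601"));
        System.out.println(decode("Price: U+0024 5"));  // Price: $ 5
        System.out.println(decode("Bad: U+FFFFFF"));    // Bad: U+FFFFFF
    }
}
```

Whether to leave invalid sequences as-is or reject them is a design choice; leaving them in place keeps the method from throwing on arbitrary user text.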


 