
How to Convert Decimal UTF-16 Surrogate Pairs to Unicode Code Points in Java

I have some string data like

&#55357;&#56842;

This is a UTF-16 surrogate pair, with each surrogate written as a decimal HTML entity.

How can I convert such pairs to Unicode code points in Java, so that my client receives a single decimal HTML entity for the code point instead of the surrogate pair?

Example: for the string above, the expected result is &#128522;

Assuming you already parsed the string to get the 2 numbers, just create a String from those two char values:

String s = new String(new char[] { 55357, 56842 });
System.out.println(s);

Output

😊

To get the code point of that:

s.codePointAt(0) // returns 128522

You don't have to create a string though:

Character.toCodePoint((char) 55357, (char) 56842) // returns 128522
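
Putting the pieces together, here is a minimal sketch of the whole conversion: it scans the text for two adjacent decimal entities, checks that they form a high/low surrogate pair, and replaces them with one entity for the combined code point. The class name SurrogateEntityMerger and the method name merge are made up for illustration. The sketch assumes entities are written in decimal (hex entities such as &#xD83D; are not handled) and tolerates optional whitespace before the semicolon.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class SurrogateEntityMerger {

    // Two adjacent decimal entities, e.g. "&#55357;&#56842;".
    // \s* tolerates a space before ';' (an assumption about the input).
    private static final Pattern ENTITY_PAIR =
            Pattern.compile("&#(\\d{1,9})\\s*;&#(\\d{1,9})\\s*;");

    static String merge(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = ENTITY_PAIR.matcher(input);
        int copiedUpTo = 0; // input before this index is already in 'out'
        int searchFrom = 0; // where the next regex search starts

        while (m.find(searchFrom)) {
            int hi = Integer.parseInt(m.group(1));
            int lo = Integer.parseInt(m.group(2));
            boolean isPair =
                    hi >= Character.MIN_HIGH_SURROGATE && hi <= Character.MAX_HIGH_SURROGATE
                    && lo >= Character.MIN_LOW_SURROGATE && lo <= Character.MAX_LOW_SURROGATE;

            if (isPair) {
                // Copy the untouched text, then emit one entity for the code point.
                out.append(input, copiedUpTo, m.start());
                int codePoint = Character.toCodePoint((char) hi, (char) lo);
                out.append("&#").append(codePoint).append(';');
                copiedUpTo = m.end();
                searchFrom = m.end();
            } else {
                // Not a surrogate pair: retry starting at the second entity.
                searchFrom = m.start() + 1;
            }
        }
        out.append(input, copiedUpTo, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(merge("&#55357;&#56842;")); // prints &#128522;
    }
}
```

Entities that are not part of a valid high/low pair, including a lone surrogate, are left unchanged. Whether a lone surrogate should instead be dropped or reported depends on what the client expects.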


 