
How to put a supplementary Unicode character in a string literal?

How can I put a supplementary Unicode character (say, code point U+10400) in a string literal? I have tried a surrogate pair like this:

String text = "TEST \uD801\uDC00";
System.out.println(text);

but it doesn't seem to work.
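For reference, the pair in the literal can be cross-checked against the standard library. This is only a minimal sketch: the class name SurrogatePairDemo and the helper fromCodePoint are made up for illustration, while Character.toChars, Character.highSurrogate and Character.lowSurrogate are standard JDK methods (the last two need Java 7 or later).

```java
class SurrogatePairDemo {
    // Hypothetical helper: builds a String from one code point.
    // Supplementary code points (above U+FFFF) become two chars.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        int cp = 0x10400; // the code point from the question
        String s = fromCodePoint(cp);

        // Print both halves of the surrogate pair as hex code units.
        System.out.printf("high surrogate: %04X%n", (int) Character.highSurrogate(cp));
        System.out.printf("low surrogate:  %04X%n", (int) Character.lowSurrogate(cp));

        // Compare with the hand-written escape sequence from the question.
        System.out.println("matches literal: " + s.equals("\uD801\uDC00"));
    }
}
```

If the last line prints true, the literal is right and the problem lies elsewhere, most likely in how the text is displayed.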

UPDATE:

The good news is, the string is constructed properly.
Byte array in UTF-8: 54 45 53 54 20 f0 90 90 80
Byte array in UTF-16: fe ff 00 54 00 45 00 53 00 54 00 20 d8 01 dc 00

But the bad news is that it is not printed properly on my Fedora box: I see a square instead of the expected symbol, because my console does not support Unicode properly.
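A byte dump like the one in the update can be produced roughly as follows. This is a sketch, not necessarily how the dump above was made: HexDumpDemo and toHex are hypothetical names, and it assumes the JDK's UTF-16 charset, which writes a big-endian byte order mark when encoding.

```java
import java.nio.charset.StandardCharsets;

class HexDumpDemo {
    // Hypothetical helper: formats bytes as space-separated two-digit hex.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to avoid sign extension of negative bytes.
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "TEST \uD801\uDC00";
        System.out.println("UTF-8:  " + toHex(text.getBytes(StandardCharsets.UTF_8)));
        System.out.println("UTF-16: " + toHex(text.getBytes(StandardCharsets.UTF_16)));
    }
}
```

Checking the bytes this way separates "the string is wrong" from "the console cannot display it", because it does not depend on the terminal's font or encoding.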

"Works for me". What exactly is the issue?

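public class Main {
    public static void main(String[] args) throws Exception {
        int cp = 0x10400;                   // U+10400 as an int
        String text = "test \uD801\uDC00";  // the same code point as a surrogate pair
        System.out.println("cp:    " + cp);
        // Index 5 is the high surrogate; codePointAt combines it with the low one.
        System.out.println("found: " + text.codePointAt(5));
        System.out.println("len:   " + text.length());
    }
}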

Output:

cp:    66560
found: 66560
len:   7

Note that length(), like most String methods, counts char values (UTF-16 code units), not Unicode characters, which is why it reports 7 rather than 6. So much for awesome Unicode support :)
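If you need the number of actual code points, String.codePointCount does that. A small sketch (CodePointCountDemo and countCodePoints are names invented here; codePoints() needs Java 8 or later):

```java
class CodePointCountDemo {
    // Hypothetical helper: counts Unicode code points instead of UTF-16 code units.
    static int countCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String text = "test \uD801\uDC00";
        System.out.println("chars:       " + text.length());         // UTF-16 code units
        System.out.println("code points: " + countCodePoints(text)); // Unicode code points

        // Iterate by code point; the surrogate pair arrives as a single int.
        text.codePoints().forEach(cp -> System.out.printf("U+%04X%n", cp));
    }
}
```

For the string above this reports 7 chars and 6 code points.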

Happy coding.

It is supposed to work using the following, where h is presumably the int code point:

System.out.println(
    "text = " + new String(Character.toChars(h))
);

But the output is:

text = ?
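A "?" in the output usually means the character could not be encoded in the charset that System.out is using, which is a separate problem from building the string. One thing worth trying is wrapping System.out in a PrintStream with an explicit encoding. The sketch below assumes the terminal itself expects UTF-8 and has a font with the glyph; Utf8ConsoleDemo and utf8Stream are hypothetical names, and h is set to 0x10400 only as an assumed example value.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class Utf8ConsoleDemo {
    // Hypothetical helper: wraps a stream so that text is written as UTF-8.
    static PrintStream utf8Stream(OutputStream out) {
        try {
            return new PrintStream(out, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int h = 0x10400; // assumed value; the original snippet does not define h
        PrintStream out = utf8Stream(System.out);
        out.println("text = " + new String(Character.toChars(h)));
    }
}
```

If this still shows a square, the bytes are most likely correct and the terminal font simply lacks the glyph.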


 