
How many characters are in Java

How many unique characters exist in Java? I've looped to over 10,000, and characters are still being found:

for (int i = 0; i < 10000; i++)
    System.out.println((char) i);

Are there Integer.MAX_VALUE characters? I always thought there were only 255 for some reason.

Java uses Unicode. Unicode code points range from U+0000 to U+10FFFF, which gives 1,114,112 possible code points.

But not all of them are defined. If you want to know how many of them are "supported", you can count the code points that fall inside a Unicode block known to your Java runtime:

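import java.util.Objects;
import java.util.stream.IntStream;

// UnicodeBlock.of(int) returns null for code points outside every known block
final long nrChars = IntStream.rangeClosed(0, 0x10ffff)
    .mapToObj(Character.UnicodeBlock::of)
    .filter(Objects::nonNull)
    .count();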
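Block membership is a coarse measure, since a block can contain code points that have no character assigned yet. As a rough sketch of a stricter count, the following uses Character.isDefined(int) instead; the class and method names are made up for illustration, and the result depends on the Unicode version shipped with your JDK, so no expected number is given.

```java
import java.util.stream.IntStream;

// Hypothetical helper class, not part of the original answer.
class DefinedCodePointCount {

    // Counts code points that Character.isDefined reports as defined
    // (general category other than "unassigned") in this runtime.
    static long countDefined() {
        return IntStream.rangeClosed(Character.MIN_CODE_POINT, Character.MAX_CODE_POINT)
                .filter(Character::isDefined)
                .count();
    }

    // Counts code points that belong to some Unicode block known to this runtime.
    static long countInBlock() {
        return IntStream.rangeClosed(Character.MIN_CODE_POINT, Character.MAX_CODE_POINT)
                .filter(cp -> Character.UnicodeBlock.of(cp) != null)
                .count();
    }

    public static void main(String[] args) {
        System.out.println("defined:  " + countDefined());
        System.out.println("in block: " + countInBlock());
    }
}
```

Note that isDefined also returns true for surrogate and private-use code points, so neither number is "characters you can display".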

Also note that, for historical reasons, Java's char can directly represent only code points up to U+FFFF. For the "rest" (which is now pretty much the majority of defined code points), Java uses a surrogate pair, that is, two char values for one code point. See Character.toChars().
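To make the surrogate pair idea concrete, here is a small sketch. The class name and the charsNeeded helper are invented for this example; the Character methods are standard JDK calls.

```java
// Hypothetical demo class, not part of the original answer.
class SurrogatePairDemo {

    // Number of char values (UTF-16 code units) needed for one code point.
    static int charsNeeded(int codePoint) {
        return Character.toChars(codePoint).length;
    }

    public static void main(String[] args) {
        int euro = 0x20AC;      // EURO SIGN, fits in a single char
        int grinning = 0x1F600; // GRINNING FACE, above U+FFFF

        System.out.println(charsNeeded(euro));     // 1
        System.out.println(charsNeeded(grinning)); // 2

        // The two chars are a high surrogate followed by a low surrogate.
        char[] pair = Character.toChars(grinning);
        System.out.println(Character.isHighSurrogate(pair[0])); // true
        System.out.println(Character.isLowSurrogate(pair[1]));  // true

        // Combining the pair gives back the original code point.
        System.out.println(Character.toCodePoint(pair[0], pair[1]) == grinning); // true
    }
}
```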

Java was designed to use Unicode internally, so that diverse scripts can be combined in one String. Unicode assigns a number to every character of every script; these numbers need up to 21 bits, so they reach into the 3 byte range. Such Unicode "code points" are represented as int in Java.

From the start, char and String were meant for text, with char using UTF-16 (a Unicode representation built from 16-bit units, sometimes needing two chars for one Unicode code point). (However, String constants in a .class file are stored in a modified form of UTF-8.)
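One practical consequence is that String.length() counts UTF-16 code units, not code points. The following sketch shows the three different "lengths" of the same text; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original answer.
class StringLengthDemo {

    // Number of char values (UTF-16 code units) in the string.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of bytes when the string is encoded as standard UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // "a" followed by U+1F600, which needs a surrogate pair in UTF-16.
        String s = "a" + new String(Character.toChars(0x1F600));

        System.out.println(codeUnits(s));  // 3
        System.out.println(codePoints(s)); // 2
        System.out.println(utf8Bytes(s));  // 5
    }
}
```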

char hence takes 2 bytes. byte takes 1 byte and byte[] is for binary data.
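This also explains what happens with a loop like the one in the question if it runs far enough: a cast from int to char keeps only the low 16 bits, so values above 65,535 wrap around instead of producing new characters. A minimal sketch, with an invented class name and helper:

```java
// Hypothetical demo class, not part of the original answer.
class CharWidthDemo {

    // Casting an int to char keeps only the low 16 bits of the value.
    static char truncate(int value) {
        return (char) value;
    }

    public static void main(String[] args) {
        System.out.println(Character.BYTES);           // 2
        System.out.println(Byte.BYTES);                // 1
        System.out.println((int) Character.MAX_VALUE); // 65535

        // 0x10041 does not fit in 16 bits; the cast drops the high bits,
        // leaving 0x0041, which is 'A'.
        System.out.println(truncate(0x10041));         // A
    }
}
```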

In earlier languages (C, C++) there was often no such distinction between char and byte.
