
Java Unicode strings sorting

In Java, how do Unicode strings get compared?

What I mean is, if I have a few, say, Japanese strings and I do the following:

java.util.Arrays.sort(arrayOfJapaneseStrings);

how do those strings get compared and sorted?

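By default, Strings sort lexicographically by the numeric value of their UTF-16 code units (Java char values). For characters outside the BMP, which are stored as surrogate pairs, this differs from code point order, so it might not be exactly what you want for those characters. Nearly all Japanese characters in common use are in the BMP, so you shouldn't have a problem with these.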
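As a rough illustration of the default ordering described above, the following sketch sorts four single-character strings and prints their code points. The class name DefaultSortDemo and the helper sortedCopy are made up for this example; the sample characters were chosen only to show where code unit order and code point order disagree.

```java
import java.util.Arrays;

public class DefaultSortDemo {
    // Returns a sorted copy using String's natural ordering (String.compareTo),
    // which compares UTF-16 code units one by one.
    static String[] sortedCopy(String[] input) {
        String[] copy = input.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        String[] words = {
            "\uFF21",        // U+FF21, fullwidth Latin A (in the BMP)
            "\uD842\uDFB7",  // U+20BB7, a supplementary-plane kanji (surrogate pair)
            "\u3042",        // U+3042, hiragana A
            "\u30A2"         // U+30A2, katakana A
        };
        for (String s : sortedCopy(words)) {
            System.out.printf("U+%04X%n", s.codePointAt(0));
        }
        // Prints U+3042, U+30A2, U+20BB7, U+FF21: the surrogate pair starts
        // with code unit 0xD842, which is smaller than 0xFF21, so U+20BB7
        // sorts before U+FF21 even though its code point is larger.
    }
}
```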

If you would like a different sort order, you can use java.text.Collator (or a subclass such as RuleBasedCollator) to define one.

By default, strings are compared by their UTF-16 code units (Java char values), not by any locale rules. This is the fastest way, and hence perfect if all you need is some consistent order (e.g. if you are going to use a binary search later, the strings must be sorted, but exactly what "sorted" means doesn't matter, so the faster the better).
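A minimal sketch of that binary search case, assuming the array is sorted and searched with the same natural ordering. BinarySearchDemo and containsSorted are hypothetical names, and the city names are just sample data.

```java
import java.util.Arrays;

public class BinarySearchDemo {
    // The array must already be sorted by String's natural ordering,
    // the same ordering Arrays.binarySearch uses here.
    static boolean containsSorted(String[] sorted, String key) {
        return Arrays.binarySearch(sorted, key) >= 0;
    }

    public static void main(String[] args) {
        // Tokyo, Osaka, Kyoto written in kanji, as Unicode escapes.
        String[] cities = {"\u6771\u4EAC", "\u5927\u962A", "\u4EAC\u90FD"};
        Arrays.sort(cities); // code unit order; not meaningful to a reader, but consistent

        System.out.println(containsSorted(cities, "\u5927\u962A")); // true (Osaka is present)
        System.out.println(containsSorted(cities, "\u5948\u826F")); // false (Nara is absent)
    }
}
```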

If you need an ordering that is sensible to a user in a given locale, use the java.text.Collator class.
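Something along these lines should work for locale-aware sorting. The class CollatorSortDemo and the helper sortForLocale are illustrative names, not part of any API. English words are used for the checked example because the expected result is easy to verify; for the Japanese strings in the question you would pass Locale.JAPANESE instead.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

public class CollatorSortDemo {
    // Returns a copy sorted by the collation rules of the given locale.
    // Collator implements Comparator<Object>, so it can be passed to Arrays.sort.
    static String[] sortForLocale(String[] input, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        String[] copy = input.clone();
        Arrays.sort(copy, collator);
        return copy;
    }

    public static void main(String[] args) {
        String[] words = {"cherry", "Banana", "apple"};

        String[] natural = words.clone();
        Arrays.sort(natural);
        // Code unit order puts uppercase 'B' (0x42) before lowercase 'a' (0x61).
        System.out.println(Arrays.toString(natural)); // [Banana, apple, cherry]

        // The English collator orders letters alphabetically regardless of case.
        System.out.println(Arrays.toString(sortForLocale(words, Locale.ENGLISH))); // [apple, Banana, cherry]

        // The same helper with a Japanese collator; the result depends on the
        // collation rules shipped with the runtime, so no output is shown here.
        String[] kana = {"\u30A2", "\u3044", "\u3042"};
        System.out.println(Arrays.toString(sortForLocale(kana, Locale.JAPANESE)));
    }
}
```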

The order follows the compareTo method of the String class. See the Javadoc:

Compares two strings lexicographically. The comparison is based on the Unicode value of each character in the strings. The character sequence represented by this String object is compared lexicographically to the character sequence represented by the argument string. The result is a negative integer if this String object lexicographically precedes the argument string. The result is a positive integer if this String object lexicographically follows the argument string. The result is zero if the strings are equal; compareTo returns 0 exactly when the equals(Object) method would return true.
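The contract quoted above can be checked with a small sketch like this one. CompareToDemo and sign are hypothetical names; the helper reduces the result to -1, 0, or 1 because compareTo only guarantees the sign, not a particular magnitude.

```java
public class CompareToDemo {
    // Normalizes the compareTo result to -1, 0, or 1.
    static int sign(String a, String b) {
        return Integer.signum(a.compareTo(b));
    }

    public static void main(String[] args) {
        String a = "\u3042"; // hiragana A, U+3042
        String i = "\u3044"; // hiragana I, U+3044

        System.out.println(sign(a, i)); // -1: U+3042 precedes U+3044
        System.out.println(sign(i, a)); // 1: the reverse comparison
        System.out.println(sign(a, a)); // 0: equal strings
        // compareTo returns 0 exactly when equals returns true.
        System.out.println(a.equals(a) == (a.compareTo(a) == 0)); // true
    }
}
```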


 