
Lexicographic Order in Java

How is the lexicographic order defined in Java, especially with regard to special characters like !, . and so on?

An example order can be found here.

But how does Java define its order? I ask because I'm sorting Strings in Java and in Oracle and get different results, and I can't find the specification for the lexicographic order.

From the docs for String.compareTo:

Compares two strings lexicographically. The comparison is based on the Unicode value of each character in the strings.

and

This is the definition of lexicographic ordering. If two strings are different, then either they have different characters at some index that is a valid index for both strings, or their lengths are different, or both. If they have different characters at one or more index positions, let k be the smallest such index; then the string whose character at position k has the smaller value, as determined by using the < operator, lexicographically precedes the other string. In this case, compareTo returns the difference of the two character values at position k in the two string [...]

So basically, it treats each string like a sequence of 16-bit unsigned integers. There is no cultural awareness, no understanding of composite characters, etc. If you want a more complex kind of sort, you should be looking at Collator.
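To make the contrast concrete, here is a minimal sketch that sorts the same strings once by their natural ordering (String.compareTo) and once with a Collator. The class name LexOrderDemo, its helper methods, and the sample strings are made up for illustration, and Locale.ENGLISH is an arbitrary choice.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class; names and sample data are illustrative only.
class LexOrderDemo {

    // Sorts a copy of the input by String.compareTo, i.e. by UTF-16 code unit values.
    static List<String> codeUnitOrder(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    // Sorts a copy of the input with the locale-sensitive rules of a Collator.
    static List<String> collatorOrder(List<String> input, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> copy = new ArrayList<>(input);
        copy.sort(collator);
        return copy;
    }

    public static void main(String[] args) {
        List<String> words =
                Arrays.asList("apple", "Banana", "_under", "1one", ".dot", "!bang");

        // '!' (0x21) < '.' (0x2E) < '1' (0x31) < 'B' (0x42) < '_' (0x5F) < 'a' (0x61)
        System.out.println(codeUnitOrder(words));
        // prints [!bang, .dot, 1one, Banana, _under, apple]

        // The collator ignores case at the primary level, so "apple" precedes "Banana".
        System.out.println(collatorOrder(words, Locale.ENGLISH));
    }
}
```

Where the collator places punctuation relative to digits and letters depends on the collation rules shipped with the JDK, so no expected output is shown for the second line.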

In Java it's based on the Unicode value of each character in the string:

http://download.oracle.com/javase/1.4.2/docs/api/java/lang/String.html#compareTo(java.lang.String)

In Oracle, it will depend on the character set you are using on your database. You'll want it to be UTF-8 to have behavior consistent with Java.

To check the character set:

SQL> SELECT parameter, value FROM nls_database_parameters 
     WHERE parameter = 'NLS_CHARACTERSET';

PARAMETER             VALUE 
------------------    ---------------------
NLS_CHARACTERSET      UTF8

If it's not UTF-8, then you can get different comparison behavior depending on which character set your Oracle database is using.
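As a rough illustration of why the character set can matter, the sketch below encodes two strings in UTF-8 and in ISO-8859-15 and compares the encoded bytes as unsigned values. It assumes the database compares strings byte by byte (a binary sort), which the answer does not state; the class name CharsetOrderDemo, the helper compareEncoded, and the choice of ISO-8859-15 are illustrative only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it models a byte-wise comparison, not Oracle itself.
class CharsetOrderDemo {

    // Compares the encoded forms of a and b byte by byte, treating bytes as unsigned.
    static int compareEncoded(String a, String b, Charset charset) {
        byte[] x = a.getBytes(charset);
        byte[] y = b.getBytes(charset);
        int limit = Math.min(x.length, y.length);
        for (int i = 0; i < limit; i++) {
            int diff = (x[i] & 0xFF) - (y[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return x.length - y.length;
    }

    public static void main(String[] args) {
        String euro = "\u20AC";   // EURO SIGN
        String eAcute = "\u00E9"; // LATIN SMALL LETTER E WITH ACUTE

        // Java: U+20AC > U+00E9, so the euro sign sorts after the accented e.
        System.out.println(euro.compareTo(eAcute) > 0); // true

        // UTF-8: E2 82 AC vs C3 A9, same relative order as in Java.
        System.out.println(compareEncoded(euro, eAcute, StandardCharsets.UTF_8) > 0); // true

        // ISO-8859-15: A4 vs E9, so the euro sign now sorts before the accented e.
        System.out.println(compareEncoded(euro, eAcute, Charset.forName("ISO-8859-15")) > 0); // false
    }
}
```

Note that UTF-8 byte order and String.compareTo agree for these characters, but they can still differ for characters outside the Basic Multilingual Plane, which Java stores as surrogate pairs.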

From the javadocs:

The comparison is based on the Unicode value of each character in the strings.

In more detail:

This is the definition of lexicographic ordering. If two strings are different, then either they have different characters at some index that is a valid index for both strings, or their lengths are different, or both. If they have different characters at one or more index positions, let k be the smallest such index; then the string whose character at position k has the smaller value, as determined by using the < operator, lexicographically precedes the other string. In this case, compareTo returns the difference of the two character values at position k in the two string ...

Hope this helps!!
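The quoted definition can be written out by hand. The following is a sketch only, not the JDK's actual implementation: lexCompare and the class name LexCompareDemo are hypothetical, and the length-difference case comes from the part of the Javadoc that is elided in the quote.

```java
// Hypothetical demo class: a hand-written version of the quoted definition.
class LexCompareDemo {

    // Returns the difference of the chars at the first differing index k,
    // or the difference of the lengths if no such index exists.
    static int lexCompare(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        for (int k = 0; k < limit; k++) {
            char x = a.charAt(k);
            char y = b.charAt(k);
            if (x != y) {
                return x - y;
            }
        }
        return a.length() - b.length();
    }

    public static void main(String[] args) {
        // First difference at index 2: 'c' (99) - 'd' (100)
        System.out.println(lexCompare("abc", "abd"));   // -1
        // '!' is U+0021 (33), '.' is U+002E (46)
        System.out.println(lexCompare("!", "."));       // -13
        // Uppercase letters have smaller values than lowercase: 'B' (66) - 'a' (97)
        System.out.println(lexCompare("B", "a"));       // -31
        // One string is a prefix of the other: difference of the lengths
        System.out.println(lexCompare("abc", "abcde")); // -2
        // The hand-written version agrees with String.compareTo on this input.
        System.out.println(lexCompare("abc", "abd") == "abc".compareTo("abd")); // true
    }
}
```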

Employees are sorted in descending order of score; if two employees have the same score, the employee name is used to order them lexicographically.

Employee class implementation (using the Comparable interface in this case):

@Override
public int compareTo(Object obj) {
    Employee emp = (Employee) obj;

    // Higher scores sort first (descending order).
    if (emp.getScore() > this.score) return 1;
    if (emp.getScore() < this.score) return -1;

    // Equal scores: order by name, ascending, ignoring case.
    return this.empName.compareToIgnoreCase(emp.getEmpName());
}
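For a version that can be run on its own, the same ordering can also be sketched with a Comparator instead of Comparable. The Employee class below is a hypothetical minimal stand-in (the original class is not shown, and the int score type is an assumption); EmployeeSortDemo, BY_SCORE_DESC_THEN_NAME, and sortedNames are names invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class; Employee is a minimal stand-in for the original class.
class EmployeeSortDemo {

    static final class Employee {
        private final String empName;
        private final int score; // assumed to be an int; the original type is not shown

        Employee(String empName, int score) {
            this.empName = empName;
            this.score = score;
        }

        String getEmpName() { return empName; }
        int getScore() { return score; }
    }

    // Score descending, then name ascending, ignoring case.
    static final Comparator<Employee> BY_SCORE_DESC_THEN_NAME =
            Comparator.comparingInt(Employee::getScore).reversed()
                    .thenComparing(Employee::getEmpName, String.CASE_INSENSITIVE_ORDER);

    // Returns the employee names in sorted order, leaving the input list untouched.
    static List<String> sortedNames(List<Employee> employees) {
        List<Employee> copy = new ArrayList<>(employees);
        copy.sort(BY_SCORE_DESC_THEN_NAME);
        List<String> names = new ArrayList<>();
        for (Employee e : copy) {
            names.add(e.getEmpName());
        }
        return names;
    }

    public static void main(String[] args) {
        List<Employee> staff = Arrays.asList(
                new Employee("carol", 70),
                new Employee("Bob", 90),
                new Employee("alice", 90));
        System.out.println(sortedNames(staff)); // prints [alice, Bob, carol]
    }
}
```

String.CASE_INSENSITIVE_ORDER orders strings the same way as compareToIgnoreCase, so "alice" precedes "Bob" even though 'B' has a smaller Unicode value than 'a'.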
