
Why do some comparable classes in the JDK limit the comparison function to {−1, 0, 1} and some don't?

The only return value the java.lang.Comparable<T> interface explicitly requires is 0, for when T a and T b are equal. If a is less than b, then compare(a, b) must be negative, not necessarily −1, and compare(b, a) must be positive, not necessarily 1.

And yet some comparable classes in the JDK limit the output of the comparison function precisely in that manner, e.g.,

scala> (65 to 90).map(n => java.lang.Integer.compare(77, n))
res19: IndexedSeq[Int] = Vector(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1)

and some don't, e.g.,

scala> ('A' to 'Z').map(ch => java.lang.Character.compare('M', ch))
res10: IndexedSeq[Int] = Vector(12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 
-1, -2, -3, -4, -5, -6, -7, -8, -9, -10, -11, -12, -13)

I was well aware of the somewhat obscure case of RDNs (from javax.naming.ldap), but I wasn't expecting to come across this with anything from java.lang. I first noticed unrestricted output in Character.compare() in the context of a Caesar cipher program in Java, but I find it easier to run "experiments" like these in the local Scala REPL.

When I wrote my implementation of Fraction, I followed the example of Integer rather than Character.

scala> val fractA = new fractions.Fraction(65, 128)
fractA: fractions.Fraction = 65/128

scala> val fractB = new fractions.Fraction(90, 128)
fractB: fractions.Fraction = 45/64

scala> fractA to fractB
res20: fractions.FractionRange = 65/128 to 45/64

scala> val fractC = new fractions.Fraction(77, 128)
fractC: fractions.Fraction = 77/128

scala> res20.map(_.compare(fractC))
res21: IndexedSeq[Int] = Vector(-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)

scala> res20.map(fractC.compare)
res22: IndexedSeq[Int] = Vector(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1)

scala> res19 == res22
res23: Boolean = true

In the case of Fraction, it's no problem to implement it like this. Actually, it would be more trouble not to do it like this. The numerator and denominator are of type long, so there could be problems if the numerator of the difference is just a little outside the range of int. By using Long.signum(), I reduce the possibility of wrong results to a small set of edge cases.

Since char is only 16 bits wide, half the width of int, I suppose it's easier for String to not restrict the result of compare() to {−1, 0, 1}.

scala> "Hello, World!" compare "Hello, world?"
res30: Int = -32

Here I'm guessing that if no surrogates are involved, or maybe even if surrogates are involved, it's easier to just run Character.compare() on each character of the String until either the first nonzero result or reaching the end.

Is this the explanation, that one should do whatever is easiest and gives the correct results? Or is there a deeper reason to restrict the result to the signum of the difference in some comparable classes and not others?

Main answer

Because implementations are free to choose whatever positive or negative value suits them best. The spec says about the return value:

a negative integer, zero, or a positive integer as this object is less than, equal to, or greater than the specified object.

One might argue whether this spec was a wise decision or not, but that's the way it has been defined. So, never rely on the results of a comparison being only -1, 0, or 1, even if experimenting with one specific Java version shows that behaviour; it might change with the next release.
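As a small illustration of that advice, here is a sketch of caller code that looks only at the sign of a comparison result. The class name SignOnly and the helper describe are made up for this example and are not part of the JDK.

```java
// Hypothetical helper for illustration; not part of the JDK.
final class SignOnly {
    // Classifies a comparison result by its sign only, never by its exact value.
    static String describe(int cmp) {
        if (cmp < 0) return "less";
        if (cmp > 0) return "greater";
        return "equal";
    }

    public static void main(String[] args) {
        // Both results are classified the same way, whatever their magnitude.
        System.out.println(describe(Integer.compare(65, 77)));
        System.out.println(describe(Character.compare('A', 'M')));
        // Integer.signum normalizes any result to -1, 0, or 1 if that is needed.
        System.out.println(Integer.signum(Character.compare('A', 'M')));
    }
}
```

Code written this way keeps working whether a given compare method returns -1 or -12.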

Implementations answer

This answer is, for the most part, already found in your question.

There are two typical ways of implementing comparison:

  • Integer Subtraction: that gives results not limited to -1, 0, 1. The code is simple, elegant and fast. But there can be overflow, e.g. 2_000_000_000 - (-2_000_000_000) is mathematically 4_000_000_000, but with 32-bit int arithmetic the result is -294_967_296, falsely implying that 2_000_000_000 is less than -2_000_000_000. Int subtraction is only safe from overflow when both values lie within roughly +/- 1_000_000_000.
  • Decision: this typically needs an if ... else if ... else construct where the return values for the three cases are explicitly given. Then it's a natural choice to use -1, 0, and 1, and I don't know of an implementation using other fixed values.
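The two styles can be sketched side by side as follows. This is only an illustration: the class and method names are invented here, and the JDK's own sources may be written differently.

```java
// Hypothetical class for illustration; the method names are not from the JDK.
final class ComparisonStyles {
    // Subtraction style: short, but the int difference can overflow.
    static int bySubtraction(int a, int b) {
        return a - b;
    }

    // Decision style: explicit branches, always correct, returns -1, 0, or 1.
    static int byDecision(int a, int b) {
        if (a < b) return -1;
        else if (a > b) return 1;
        else return 0;
    }

    public static void main(String[] args) {
        int big = 2_000_000_000;
        int small = -2_000_000_000;
        // The true difference 4_000_000_000 does not fit in 32 bits and wraps around.
        System.out.println(bySubtraction(big, small)); // prints -294967296
        System.out.println(byDecision(big, small));    // prints 1
    }
}
```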

So, subtraction is a valid solution for byte and char, where an int-based subtraction has enough reserve bits that overflow can't happen. So, it's more likely for those datatypes and their derivatives to show values outside of -1, 0, and 1.
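A minimal sketch of why the narrow types are safe: Java promotes char and byte operands to int before subtracting, so even the extreme values produce a difference that fits easily. The class and method names below are hypothetical.

```java
// Hypothetical class for illustration; not taken from the JDK sources.
final class NarrowSubtraction {
    // char operands are promoted to int before subtraction, so the
    // difference always lies in [-65535, 65535].
    static int compareChars(char a, char b) {
        return a - b;
    }

    // byte operands are promoted to int as well; the range is [-255, 255].
    static int compareBytes(byte a, byte b) {
        return a - b;
    }

    public static void main(String[] args) {
        System.out.println(compareChars(Character.MAX_VALUE, Character.MIN_VALUE)); // prints 65535
        System.out.println(compareBytes(Byte.MIN_VALUE, Byte.MAX_VALUE));           // prints -255
    }
}
```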

Fraction class

You're writing about a Fraction class you're implementing. If two Fraction instances can be created without throwing an exception, I'd require the compareTo() method to always give correct results. As comparison of fractions is a tricky thing, overflows of intermediate results can be expected. So, I'd recommend creating some test cases with numerators and/or denominators close to the validity limits (whatever you define them to be).

Another approach to avoid overflow would be switching to the unlimited-range BigInteger type, but that might have a performance impact.
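To make the BigInteger idea concrete, here is a sketch of a comparison by cross-multiplication. FractionSketch is a hypothetical stand-in, not your actual Fraction class; the field names and the assumption that denominators are always positive are mine.

```java
import java.math.BigInteger;

// Hypothetical stand-in for the question's Fraction class; the field names
// and the positive-denominator invariant are assumptions.
final class FractionSketch implements Comparable<FractionSketch> {
    private final long numerator;
    private final long denominator; // assumed to be > 0

    FractionSketch(long numerator, long denominator) {
        if (denominator <= 0) {
            throw new IllegalArgumentException("denominator must be positive");
        }
        this.numerator = numerator;
        this.denominator = denominator;
    }

    @Override
    public int compareTo(FractionSketch other) {
        // a/b < c/d  <=>  a*d < c*b  when b and d are positive.
        // BigInteger keeps both cross products exact, so nothing overflows.
        BigInteger left = BigInteger.valueOf(numerator)
                .multiply(BigInteger.valueOf(other.denominator));
        BigInteger right = BigInteger.valueOf(other.numerator)
                .multiply(BigInteger.valueOf(denominator));
        return left.compareTo(right);
    }

    public static void main(String[] args) {
        // Values near the limits of long: plain long multiplication would overflow here.
        FractionSketch a = new FractionSketch(Long.MAX_VALUE, Long.MAX_VALUE - 1);
        FractionSketch b = new FractionSketch(Long.MAX_VALUE - 1, Long.MAX_VALUE - 2);
        System.out.println(a.compareTo(b) < 0); // prints true
    }
}
```

The two fractions in main are the kind of near-limit test case suggested above; whether the extra allocation is acceptable depends on how hot the comparison path is.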
