Why do some comparable classes in the JDK limit the comparison function to {−1, 0, 1} and some don't?

Question

The only return value that the java.lang.Comparable<T> interface explicitly requires is 0, for when T a and T b are equal. If a is less than b, then a.compareTo(b) must be negative, not necessarily −1, and b.compareTo(a) must be positive, not necessarily 1.

And yet some comparable classes in the JDK limit the output of the comparison function precisely in that manner, e.g.,

scala> (65 to 90).map(n => java.lang.Integer.compare(77, n))
res19: IndexedSeq[Int] = Vector(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1)

and some don't, e.g.,

scala> ('A' to 'Z').map(ch => java.lang.Character.compare('M', ch))
res10: IndexedSeq[Int] = Vector(12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 
-1, -2, -3, -4, -5, -6, -7, -8, -9, -10, -11, -12, -13)

I was well aware of the somewhat obscure case of RDNs (from javax.naming.ldap), but I wasn't expecting to come across this with anything from java.lang. I first noticed unrestricted output in Character.compare() in the context of a Caesar cypher program in Java, but I find it easier to run "experiments" like these in the local Scala REPL.

When I wrote my implementation of Fraction, I followed the example of Integer rather than Character.

scala> val fractA = new fractions.Fraction(65, 128)
fractA: fractions.Fraction = 65/128

scala> val fractB = new fractions.Fraction(90, 128)
fractB: fractions.Fraction = 45/64

scala> fractA to fractB
res20: fractions.FractionRange = 65/128 to 45/64

scala> val fractC = new fractions.Fraction(77, 128)
fractC: fractions.Fraction = 77/128

scala> res20.map(_.compare(fractC))
res21: IndexedSeq[Int] = Vector(-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)

scala> res20.map(fractC.compare)
res22: IndexedSeq[Int] = Vector(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1)

scala> res19 == res22
res23: Boolean = true

In the case of Fraction, it's no problem to implement it like this. Actually, it would be more trouble not to do it like this. The numerator and denominator are of type long, so there could be problems if the numerator of the difference is just a little outside the range of int. By using Long.signum(), I reduce the possibility of wrong results to a small set of edge cases.
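To illustrate the narrowing problem, here is a minimal Java sketch. The class and method names are made up for illustration and are not taken from the actual Fraction implementation; the sketch only contrasts casting a long difference to int with taking Long.signum() of it.

```java
public class SignumSketch {
    // Hypothetical helper: narrows a long difference to int, which can flip the sign.
    public static int byCast(long diffNumerator) {
        return (int) diffNumerator;
    }

    // Hypothetical helper: keeps only the sign, so the result is always -1, 0 or 1.
    public static int bySignum(long diffNumerator) {
        return Long.signum(diffNumerator);
    }

    public static void main(String[] args) {
        long diff = Integer.MAX_VALUE + 1L;  // 2_147_483_648, just outside the int range
        System.out.println(byCast(diff));    // negative, although diff is positive
        System.out.println(bySignum(diff));  // 1
    }
}
```

The long subtraction that produces the difference can itself still overflow, which is presumably where the remaining edge cases come from.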

Since char is only 16 bits wide, half the width of int, I suppose it's easier for String to not restrict the result of compare() to {−1, 0, 1}.

scala> "Hello, World!" compare "Hello, world?"
res30: Int = -32

Here I'm guessing that if no surrogates are involved, or maybe even if surrogates are involved, it's easier to just run Character.compare() on each character of the String until either the first nonzero result or reaching the end.
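A minimal Java sketch of that guess, assuming no special handling of surrogates. The name compareChars is made up, and this is not the JDK's actual source for String.compareTo(); it subtracts the char values directly, which is what Character.compare() appears to return in the REPL session above.

```java
public class StringCompareSketch {
    // Walks both strings char by char and returns the first nonzero difference,
    // falling back to the difference in lengths when one string is a prefix of the other.
    public static int compareChars(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        for (int i = 0; i < limit; i++) {
            int diff = a.charAt(i) - b.charAt(i); // chars are promoted to int, so no overflow
            if (diff != 0) {
                return diff;
            }
        }
        return a.length() - b.length();
    }

    public static void main(String[] args) {
        // 'W' is 87 and 'w' is 119, so this prints -32
        System.out.println(compareChars("Hello, World!", "Hello, world?"));
    }
}
```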

Is this the explanation, that one should do whatever is easiest and gives the correct results? Or is there a deeper reason to restrict to signum of the difference in some comparable classes and not others?

Answer 1

Main answer

Because implementations are free to choose whatever positive or negative value suits them best. The spec says about the return value:

a negative integer, zero, or a positive integer as this object is less than, equal to, or greater than the specified object.

One might argue whether this spec was a wise decision or not, but that's the way it has been defined. So, never rely on the results of a comparison being only -1, 0, or 1, even if experimenting with one specific Java version shows that behaviour - it might change with the next release.
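As a small illustration of that advice, the following Java sketch contrasts a check that assumes -1 with one that only looks at the sign. The class and method names are hypothetical.

```java
public class SignCheckSketch {
    // Fragile: assumes "less than" is always reported as exactly -1.
    public static boolean isLessFragile(char a, char b) {
        return Character.compare(a, b) == -1;
    }

    // Robust: relies only on the sign, which is all the contract promises.
    public static boolean isLess(char a, char b) {
        return Character.compare(a, b) < 0;
    }

    public static void main(String[] args) {
        System.out.println(isLess('A', 'M'));        // true
        System.out.println(isLessFragile('A', 'M')); // false whenever compare() returns e.g. -12
    }
}
```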

Implementations answer

Most of this answer is already contained in your question.

There are two typical ways of implementing comparison:

Integer subtraction: this gives results not limited to -1, 0, 1. The code is simple, elegant and fast. But it can overflow: 2_000_000_000 - (-2_000_000_000) is mathematically 4_000_000_000, but with 32-bit int the result comes out as -294_967_296, falsely implying that 2_000_000_000 is less than -2_000_000_000. Int subtraction is only safe when both operands stay within roughly +/- 1_000_000_000.
Decision: this typically needs an if ... else if ... else construct where the return values for the three cases are given explicitly. Then it's a natural choice to use -1, 0, and 1, and I don't know of an implementation using other fixed values.

So, subtraction is a valid solution for byte and char, where an int-based subtraction has enough reserve bits that overflow can't happen. That makes it more likely for those data types and their derivatives to show values outside of -1, 0, and 1.
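The two approaches can be sketched in Java as follows. The method names are hypothetical and this is not the JDK source, just an illustration of the overflow described above and of why subtraction is safe for char operands.

```java
public class SubtractionOverflowSketch {
    // Subtraction-based comparison: fast, but overflows for widely separated ints.
    public static int bySubtraction(int a, int b) {
        return a - b;
    }

    // Decision-based comparison: always correct, naturally yields -1, 0 or 1.
    public static int byDecision(int a, int b) {
        if (a < b) {
            return -1;
        } else if (a > b) {
            return 1;
        } else {
            return 0;
        }
    }

    public static void main(String[] args) {
        int a = 2_000_000_000;
        int b = -2_000_000_000;
        System.out.println(bySubtraction(a, b));     // -294967296, wrong sign
        System.out.println(byDecision(a, b));        // 1
        System.out.println(bySubtraction('M', 'A')); // 12, chars cannot overflow an int
    }
}
```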

Fraction class

You're writing about a Fraction class you're implementing. If two Fraction instances can be created without throwing an exception, I'd require the compareTo() method to always give correct results. As comparison of fractions is a tricky thing, overflows of intermediate results can be expected. So, I'd recommend creating some test cases with numerators and/or denominators close to the validity limits (whatever you define them to be).

Another approach to avoid overflow would be switching to the unlimited-range BigInteger type, but that might have a performance impact.
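A minimal sketch of the BigInteger approach, assuming fractions are given as long numerator/denominator pairs with strictly positive denominators. The class and method names are hypothetical, since I don't know your actual Fraction code. It compares n1/d1 with n2/d2 by cross-multiplication, which cannot overflow in BigInteger.

```java
import java.math.BigInteger;

public class FractionCompareSketch {
    // Compares n1/d1 with n2/d2 by cross-multiplication.
    // Assumption: d1 > 0 and d2 > 0, otherwise the sign of the result would be wrong.
    public static int compareFractions(long n1, long d1, long n2, long d2) {
        BigInteger left = BigInteger.valueOf(n1).multiply(BigInteger.valueOf(d2));
        BigInteger right = BigInteger.valueOf(n2).multiply(BigInteger.valueOf(d1));
        return left.compareTo(right);
    }

    public static void main(String[] args) {
        long max = Long.MAX_VALUE;
        // 65/128 versus 77/128: negative result
        System.out.println(compareFractions(65, 128, 77, 128));
        // Near the limits of long, where plain long multiplication would overflow:
        // max/(max - 1) is slightly smaller than (max - 1)/(max - 2), so negative again
        System.out.println(compareFractions(max, max - 1, max - 1, max - 2));
    }
}
```

Edge cases like the second call are the kind of test input worth adding near the validity limits.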

solution1
3 ACCEPTED 2020-09-22 07:52:59
