I have been studying the Java HashMap source code, specifically the part that decides which bucket an object goes into, and noticed a change in Java 7 (8) compared to Java 6. I also ran numerous experiments, and both expressions yield the same result:
hash % n
and
hash & (n - 1)
where n is the array length, which must be a power of 2.
I just cannot figure out why this is true. Is there a theorem or mathematical law that proves these expressions are equal? Basically, I want to understand the reasoning and prove the equivalence of the two expressions.
PS. If n is not a power of 2, the equivalence breaks immediately.
If n is a power of two, its binary representation is 10000..., and n-1 is 1111111... with one less digit.
That means that binary &-ing with (n-1) preserves exactly those bits of k at the positions where n-1 has a bit set.
Example: n = 8 is 1000 and n-1 = 7 is 111.
&-ing with, for example, k = 201 (11001001) gives:
k % n = k & (n-1) = 11001001 & 111 = 001 = 1.
For a non-negative k, %-ing with a power of 2 means that in binary you strip away everything at or above the only set bit: for n = 8 that means stripping the 4th bit and everything above it. And that is exactly what the &-ing does as well.
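To make the bit stripping concrete, here is a minimal Java sketch. The class and method names (MaskVsModulo, byModulo, byMask) are made up for illustration, and only non-negative k is considered.

```java
// Illustrative sketch only; the class and method names are hypothetical.
class MaskVsModulo {
    // Remainder computed with the % operator.
    static int byModulo(int k, int n) {
        return k % n;
    }

    // Keep only the bits of k at positions where n - 1 has a bit set.
    static int byMask(int k, int n) {
        return k & (n - 1);
    }

    public static void main(String[] args) {
        int k = 201; // binary 11001001

        // n = 8 is a power of two, so both approaches agree: prints "1 1".
        System.out.println(byModulo(k, 8) + " " + byMask(k, 8));

        // n = 6 is not a power of two, so they differ: prints "3 1".
        System.out.println(byModulo(k, 6) + " " + byMask(k, 6));
    }
}
```

With n = 6 the mask n - 1 = 101 has a gap in it, so it no longer selects a contiguous run of low bits, which is why the equivalence needs a power of two.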
A side effect is that & is commutative: hash & (n - 1) is equivalent to (n - 1) & hash, which is not true for %. The JDK source code uses the latter form in many places, e.g. in getNode.
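As a rough illustration of what such an index computation can look like, here is a simplified sketch. It is not the JDK's code: BucketIndex, spread and bucketIndex are hypothetical names, and the spreading step is only modeled on what Java 8's HashMap does before masking.

```java
// Simplified sketch, not the JDK implementation; all names are hypothetical.
class BucketIndex {
    // Fold the high 16 bits into the low 16 so they can influence the masked index.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // n is assumed to be a power of two (the table length).
    static int bucketIndex(Object key, int n) {
        int hash = (key == null) ? 0 : spread(key.hashCode());
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        System.out.println(bucketIndex("hello", n));
        System.out.println(bucketIndex(-4, n)); // negative hashCode, index still in 0..15

        // & is commutative, so operand order does not matter.
        int h = spread("hello".hashCode());
        System.out.println(((n - 1) & h) == (h & (n - 1))); // prints true
    }
}
```

Because the mask clears the sign bit along with all other high bits, the index stays within 0..n-1 even when hashCode() is negative.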
Think about the bits in (n - 1) if n is a power of 2 (or ((1 << i) - 1), if you want to simplify the constraint on n):
If n is, say, 16 (= 1 << 4), then n - 1 is 15, and the bit representations of 15 and 16 (as 32-bit ints) are:
1 = 00000000000000000000000000000001 // Shift by 4 to get...
16 = 00000000000000000000000000010000 // Subtract 1 to get...
15 = 00000000000000000000000000001111
So just the lowest 4 bits are set in 15. If you & this with another int, only the lowest 4 bits of that number can be set in the result, so the value will always be in the range 0-15, so it's like doing % 16.
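The shift-and-subtract construction above can be tried out with a small sketch. LowBitsMask, lowBitsMask and bits32 are hypothetical helper names introduced only for this example.

```java
// Illustrative sketch; the class and helper names are hypothetical.
class LowBitsMask {
    // Mask with the lowest i bits set, assuming 0 <= i <= 31.
    static int lowBitsMask(int i) {
        return (1 << i) - 1;
    }

    // 32-character binary string, left-padded with zeros.
    static String bits32(int x) {
        return String.format("%32s", Integer.toBinaryString(x)).replace(' ', '0');
    }

    public static void main(String[] args) {
        int mask = lowBitsMask(4); // 15
        System.out.println(bits32(1 << 4)); // the power of two itself
        System.out.println(bits32(mask));   // one less: only the low 4 bits set

        // Whatever the input, the masked value stays within 0..15.
        for (int value : new int[] {0, 15, 16, 17, 201, Integer.MAX_VALUE}) {
            System.out.println(value + " & " + mask + " = " + (value & mask));
        }
    }
}
```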
However, note that this equivalence doesn't hold for a negative first operand:
System.out.println(-1 % 2); // -1
System.out.println(-1 & (2-1)); // 1
The arithmetic rule for integer / and % is:
x*(y/x) + (y%x) = y
What about a negative hash of -4 and a positive n of 8?
8*0 + (-4%8) = -4
Hence % keeps the sign of the dividend:
-4 % 8 = -4
-4 & 7 = 4
Or, expressed as code, the two agree once a negative remainder is shifted back into range:
int t = hash%n;
if (t < 0) {
t += n;
}
assert t == (hash & (n-1));
So in earlier Java, with % n, hash had to be non-negative to begin with. Now hash may be negative, which makes the hashing more robust. That is a sound reason for this subtle change in the Java source code.
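One way to check this behaviour is to compare the mask against Math.floorMod (available since Java 8), which returns a non-negative result for a positive modulus. The sketch below uses hypothetical names (NegativeHashCheck, maskIndex) and assumes n is a positive power of two.

```java
// Illustrative sketch; class and method names are hypothetical.
class NegativeHashCheck {
    // Index via masking; n is assumed to be a positive power of two.
    static int maskIndex(int hash, int n) {
        return hash & (n - 1);
    }

    public static void main(String[] args) {
        int n = 8;
        for (int hash : new int[] {-9, -4, -1, 0, 4, 201, Integer.MIN_VALUE}) {
            int rem = hash % n;                 // takes the sign of hash, may be negative
            int floor = Math.floorMod(hash, n); // always in 0..n-1 for positive n
            int masked = maskIndex(hash, n);    // matches floorMod, not %
            System.out.println(hash + ": % -> " + rem
                    + ", floorMod -> " + floor + ", & -> " + masked);
        }
    }
}
```

So for negative hashes the mask behaves like a floored modulus rather than like Java's % operator.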
Background:
2^n is a 1 followed by n 0s (in binary). 2^n - 1 is n 1s.
Hence for n being a positive power of 2, and some non-negative number h:
h % n == h & (n-1)
Another use of the x & (x - 1) pattern is counting the set bits in an int. The Integer class has just such a function, Integer.bitCount.
int bits = 0;
while (x != 0) {
    x &= x - 1; // clears the lowest set bit of x
    ++bits;
}
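Wrapped in a method, the loop can be compared directly with Integer.bitCount. This is a minimal sketch; BitCounter and countBits are hypothetical names, and the library method may well be implemented differently.

```java
// Illustrative sketch; class and method names are hypothetical.
class BitCounter {
    // Counts set bits by repeatedly clearing the lowest set bit.
    static int countBits(int x) {
        int bits = 0;
        while (x != 0) {
            x &= x - 1; // clears the lowest set bit of x
            ++bits;
        }
        return bits;
    }

    public static void main(String[] args) {
        for (int x : new int[] {0, 1, 201, -1, Integer.MIN_VALUE}) {
            // Both columns should show the same count.
            System.out.println(x + ": " + countBits(x) + " " + Integer.bitCount(x));
        }
    }
}
```

The loop runs once per set bit, so it also terminates for negative inputs: -1 takes 32 iterations and Integer.MIN_VALUE takes one.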