
Binary arithmetic: why hash%n is equivalent to hash&(n-1)?

I have been studying the Java HashMap source code, specifically the part that decides which bucket to put an object in, and saw this change in Java 7 (8) as compared to Java 6. I also ran numerous experiments, and both expressions yield the same result:

hash % n
and
hash & (n - 1)
where n is the array length, which must be a power of 2.

I just cannot figure out why this is true. Is there a theorem or some mathematical law that proves these expressions are equal? Basically, I want to understand the reasoning and prove the equivalence of the two expressions.

PS. If n is not a power of 2, the equivalence breaks immediately.

If n is a power of two, its binary representation is 10000...,
and n-1 is therefore 1111111... with one digit fewer.

That means that binary &-ing with (n-1) preserves exactly those bits of k at the positions where n-1 has a bit set, i.e. the low bits of k.

Example n = 8: 1000, n-1 = 7: 111
&-ing for example k = 201: 11001001
k % n = k & (n-1) = 11001001 & 111 = 001 = 1 .

%-ing with a power of 2 means that in binary you strip away everything at or above the only set bit: for n = 8 that means stripping everything from the 4th bit upward. And that is exactly what the &-ing does as well.
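
The example above can be reproduced with a minimal sketch. The class name MaskDemo and the helpers byRemainder and byMask are hypothetical names chosen for illustration; they are not taken from the JDK.

```java
class MaskDemo {
    // Hypothetical helper: bucket index computed with the remainder operator.
    static int byRemainder(int hash, int n) {
        return hash % n;
    }

    // Hypothetical helper: bucket index computed with a bit mask.
    // Only equivalent to byRemainder when n is a power of two and hash >= 0.
    static int byMask(int hash, int n) {
        return hash & (n - 1);
    }

    public static void main(String[] args) {
        int k = 201;
        int n = 8;
        System.out.println(Integer.toBinaryString(k));     // 11001001
        System.out.println(Integer.toBinaryString(n - 1)); // 111
        System.out.println(byRemainder(k, n));             // 1
        System.out.println(byMask(k, n));                  // 1
    }
}
```

Both calls keep only the three lowest bits of 201, which is why they agree.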


A side effect is that & is commutative: hash & (n - 1) is equivalent to (n - 1) & hash, which is not true for %. The JDK source code uses the latter form in many places, e.g. in getNode.

Think about the bits in (n - 1) if n is a power of 2 (or ((1 << i) - 1), if you want to simplify the constraint on n):

If n is, say, 16 (= 1 << 4), then n - 1 is 15, and the bit representations of 15 and 16 (as 32-bit ints) are:

 1 = 00000000000000000000000000000001  // Shift by 4 to get...
16 = 00000000000000000000000000010000  // Subtract 1 to get...
15 = 00000000000000000000000000001111

So just the lowest 4 bits are set in 15. If you & this with another int, it will only allow bits in the last 4 bits of that number to be set in the result, so the value will only be in the range 0-15, so it's like doing % 16 .
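
As a rough check of this reasoning, the sketch below compares the two expressions over a range of non-negative values. The class PowerOfTwoCheck and the method agreesUpTo are hypothetical names, and testing a finite range is an illustration, not a proof.

```java
class PowerOfTwoCheck {
    // Returns true if hash % n == (hash & (n - 1)) for every hash in [0, limit).
    static boolean agreesUpTo(int n, int limit) {
        for (int hash = 0; hash < limit; hash++) {
            if (hash % n != (hash & (n - 1))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // 16 is a power of two: 15 is 1111 in binary, so the mask keeps the low 4 bits.
        System.out.println(agreesUpTo(16, 100_000)); // true
        // 10 is not: 9 is 1001 in binary, so e.g. 2 % 10 == 2 but (2 & 9) == 0.
        System.out.println(agreesUpTo(10, 100_000)); // false
    }
}
```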


However, note that this equivalence doesn't hold for a negative first operand:

    System.out.println(-1 % 2);    // -1
    System.out.println(-1 & (2-1));  //  1


The arithmetic rule for integer / and % is:

x*(y/x) + (y%x) = y

What about a negative hash -4 and a positive n 8?

8*0 + (-4%8) = -4

Hence % keeps the sign of its left operand, while & with a non-negative mask never yields a negative result:

-4 % 8 = -4
-4 & 7 = 4

Or:

int t = hash%n;
if (t < 0) {
   t += n;
}
assert t == (hash & (n-1));
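
The same adjustment is what Math.floorMod (available since Java 8) performs for a positive divisor, so the relationship can also be sketched as follows. The class FloorModDemo and the method maskMatchesFloorMod are hypothetical names, and n is assumed to be a positive power of two.

```java
class FloorModDemo {
    // Assumes n is a positive power of two.
    // Math.floorMod returns a result with the sign of the divisor,
    // so for positive n it is always in the range [0, n).
    static boolean maskMatchesFloorMod(int hash, int n) {
        return Math.floorMod(hash, n) == (hash & (n - 1));
    }

    public static void main(String[] args) {
        System.out.println(-4 % 8);                 // -4
        System.out.println(Math.floorMod(-4, 8));   // 4
        System.out.println(-4 & 7);                 // 4
        System.out.println(maskMatchesFloorMod(-4, 8));                 // true
        System.out.println(maskMatchesFloorMod(Integer.MIN_VALUE, 16)); // true
    }
}
```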

So with % n in earlier Java, the hash had to be non-negative to begin with. With & (n - 1) the hash may be negative, which is more robust and allows better hashing. That was a sound reason for this subtle change in the Java source code.


Background:

2^k is a 1 followed by k 0s (in binary). 2^k - 1 is k 1s.

Hence for n being a positive power of 2, and some non-negative number h:

h % n == h & (n-1)

Another use of the x & (x - 1) trick is counting the set bits in an int x, since each such step clears the lowest set bit. The class Integer has just such a function, Integer.bitCount.

int bits = 0;
while (x != 0) {
    x &= x - 1;
    ++bits;
}
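
Wrapped in a method, the loop can be compared against Integer.bitCount. This is a minimal sketch; the class BitCountDemo and the method countBits are hypothetical names, not part of the JDK.

```java
class BitCountDemo {
    // Counts set bits by repeatedly clearing the lowest set bit of x.
    static int countBits(int x) {
        int bits = 0;
        while (x != 0) {
            x &= x - 1; // clears the lowest set bit
            ++bits;
        }
        return bits;
    }

    public static void main(String[] args) {
        System.out.println(countBits(201));        // 4 (201 is 11001001 in binary)
        System.out.println(Integer.bitCount(201)); // 4
        System.out.println(countBits(-1));         // 32 (all bits set)
    }
}
```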

