
What is the idea behind skipping chars in the old impl of String hashCode() in Java

Why did old versions of Java's String hashCode() implementation skip some of the characters in the String?

public int hashCode() {
   int hash = 0;
   int skip = Math.max(1, length()/8);
   for (int i = 0; i < length(); i += skip)
      hash = (hash * 37) + charAt(i);
   return hash;
}

In the current version there is no skipping, and the prime multiplier is 31 instead of 37.

Probably to speed up the hashCode() computation, but as a consequence it produced more potential collisions.
The new version favors fewer collisions but requires more computation.
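
To make the trade-off concrete, here is a minimal sketch that re-creates the old sampling hash from the question as a static helper and compares it with the current String.hashCode(). The names SkipHashDemo and oldHash are made up for illustration and are not JDK API. For a 16-character string, skip is 2, so only the even indices are read and the last character is never looked at:

```java
class SkipHashDemo {
    // Re-creation of the old sampling hash quoted in the question
    // (hypothetical helper; the original was an instance method of String).
    static int oldHash(String s) {
        int hash = 0;
        int skip = Math.max(1, s.length() / 8);
        for (int i = 0; i < s.length(); i += skip) {
            hash = (hash * 37) + s.charAt(i);
        }
        return hash;
    }

    public static void main(String[] args) {
        // Both strings are 16 characters long and differ only at index 15,
        // which the sampling loop (indices 0, 2, ..., 14) never visits.
        String a = "/api/v1/users/17";
        String b = "/api/v1/users/18";

        System.out.println(oldHash(a) == oldHash(b));       // true: collision
        System.out.println(a.hashCode() == b.hashCode());   // false: every char counts
    }
}
```

Keys such as URLs or file paths often share a long prefix and differ only in a few positions, so a sampling hash like this can send many of them to the same bucket.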

In practice, though, Strings are immutable, so in more recent versions hashCode() is computed only once and then cached:

public int hashCode() {
    int h = hash; 
    if (h == 0 && value.length > 0) {
        hash = h = isLatin1() ? StringLatin1.hashCode(value)
                              : StringUTF16.hashCode(value);
    }
    return h;
}

So it makes sense to favor this approach: it reduces the number of collisions, and reading every character in the hashCode() computation is not that expensive, because the result is cached.
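
The caching idea can be sketched outside the JDK as well. The class below is a hypothetical CachedKey, not the real String implementation; it uses the same "0 means not computed yet" convention and a demo-only counter to show that the full 31-based loop runs only once:

```java
final class CachedKey {
    private final char[] value;
    private int hash;        // 0 means "not computed yet", mirroring String
    int computations = 0;    // demo-only counter, not part of any real API

    CachedKey(String s) {
        this.value = s.toCharArray();
    }

    // equals() is omitted to keep the sketch short; a real key class needs both.
    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            computations++;
            for (char c : value) {
                h = 31 * h + c;   // same polynomial as String.hashCode()
            }
            hash = h;
        }
        return h;
    }

    public static void main(String[] args) {
        CachedKey key = new CachedKey("hello");
        key.hashCode();
        key.hashCode();
        System.out.println(key.computations);   // 1: the second call hit the cache
    }
}
```

One caveat of this convention: a non-empty value whose hash really is 0 is recomputed on every call, because 0 doubles as the "not cached" marker.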


 