
Java - How can hashCode() output a small (or negative) number when the string is long?

I wrote my own hash function, and it returns the same value as Java's built-in hashCode() when the input is short. But if I input anything longer than 5-7 characters, I get a really big number (and not the right hash code).

Here is the formula of Java's hash function (^ means exponentiation here):

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

A simpler walk-through (only works for short strings):

s = "abc" // the string
n = 3 // length of the string
s[0] = 'a'. ASCII code of 'a' = 97.
97 * (31 ^ (n - 1))
97 * (31 ^ (2))
97 * 961 = 93217

s[1] = 'b'. ASCII code of 'b' = 98.
98 * (31 ^ (n - 2))
98 * (31 ^ 1)
98 * 31 = 3038

s[2] = 'c'. ASCII code of 'c' = 99.
99 * (31 ^ (n - 3))
99 * (31 ^ 0)
99 * 1 = 99

93217 + 3038 + 99 = 96354 // same as "abc".hashCode()
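The hand calculation above can be reproduced in code. This is only a sketch of the term-by-term approach: the class name TermByTerm and its method are made up for illustration, and it uses long, so it is only meant for short strings.

```java
class TermByTerm {
    // Sums s[i] * 31^(n-1-i) one term at a time, as in the walk-through.
    // Uses long, so it is only reliable for short strings.
    static long hash(String s) {
        int n = s.length();
        long sum = 0;
        for (int i = 0; i < n; i++) {
            long power = 1;
            for (int k = 0; k < n - 1 - i; k++) {
                power *= 31; // builds 31^(n-1-i) by repeated multiplication
            }
            long term = s.charAt(i) * power;
            System.out.println("'" + s.charAt(i) + "' contributes " + term);
            sum += term;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(hash("abc"));      // 96354
        System.out.println("abc".hashCode()); // 96354
    }
}
```

For "abc" the three terms are 93217, 3038 and 99, matching the arithmetic above.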

I want to know how Java keeps the hash small even when I enter a huge string.

Java's hashcode of "Hello" - 69609650
My hashcode of "Hello" - 69609650


Java's hashcode of "Welcome to Tutorialspoint.com" - 1186874997
My hashcode of "Welcome to Tutorialspoint.com" - 5.17809991536626e+43

Also, how can the hash be negative if we only add up positive numbers?

I suspect your implementation (which you haven't shown) uses BigInteger or something similar. Java just uses int, so when the value overflows the positive range of int (a signed 32-bit type, which leaves 31 bits for positive values), it wraps around to large negative integers. Then, as you add more (positive) values, you end up with small negative integers, then small positive integers, then large positive integers, and then back to large negative integers.
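The wrap-around described here is easy to see in isolation. A minimal sketch (OverflowDemo and wrapAdd are names invented for this example):

```java
class OverflowDemo {
    // Plain int addition: on overflow it wraps around silently.
    static int wrapAdd(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        // One past the largest int lands on the smallest int.
        System.out.println(wrapAdd(Integer.MAX_VALUE, 1)); // -2147483648

        // For contrast, Math.addExact refuses to wrap and throws instead.
        try {
            Math.addExact(Integer.MAX_VALUE, 1);
        } catch (ArithmeticException e) {
            System.out.println("addExact threw: " + e.getMessage());
        }
    }
}
```

String's hashCode() relies on the silent wrapping behaviour, not on the exact-arithmetic variant.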

String's hashCode involves only int addition and multiplication, so the result is an int, which may overflow (hence the negative values).

public int hashCode() {
    int h = hash;
    int len = count;
    if (h == 0 && len > 0) {
        int off = offset;
        char val[] = value;
        for (int i = 0; i < len; i++) {
            h = 31*h + val[off++];
        }
        hash = h;
    }
    return h;
}

Based on your 5.17809991536626e+43 value, it looks like you are doing floating-point calculations (perhaps you are using Math.pow(), which returns a double). For large numbers these give different results from int arithmetic.
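Since the asker's code was never shown, the following is only a guess at what a floating-point version might look like; DoubleHash and naive are hypothetical names. It agrees with hashCode() for a short string and grows far beyond the int range for a long one.

```java
class DoubleHash {
    // Hypothetical reconstruction: evaluates the formula with Math.pow,
    // so every term and the running sum are doubles that never wrap.
    static double naive(String s) {
        int n = s.length();
        double sum = 0;
        for (int i = 0; i < n; i++) {
            sum += s.charAt(i) * Math.pow(31, n - 1 - i);
        }
        return sum;
    }

    public static void main(String[] args) {
        String shortText = "Hello";
        String longText = "Welcome to Tutorialspoint.com";

        // Short input: the sum still fits in an int, so both values agree.
        System.out.println(naive(shortText) + " vs " + shortText.hashCode());

        // Long input: the double keeps growing while the int result wraps.
        System.out.println(naive(longText) + " vs " + longText.hashCode());
    }
}
```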

Source code for String.hashCode():

public int hashCode() {
    int h = hash;
    if (h == 0 && count > 0) {
        int off = offset;
        char val[] = value;
        int len = count;

        for (int i = 0; i < len; i++) {
            h = 31*h + val[off++];
        }
        hash = h;
    }
    return h;
}

int is a 4-byte signed integer, and it simply overflows during the hash computation, yielding a value that can be negative but always stays within the range of int.
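One way to check this is to compute the exact, unbounded value of the formula and then keep only its low 32 bits, which is what BigInteger.intValue() does. The sketch below assumes nothing about the asker's code; ExactVersusInt and exactHash are illustrative names.

```java
import java.math.BigInteger;

class ExactVersusInt {
    // Exact value of s[0]*31^(n-1) + ... + s[n-1], with no overflow at all.
    static BigInteger exactHash(String s) {
        BigInteger base = BigInteger.valueOf(31);
        BigInteger h = BigInteger.ZERO;
        for (int i = 0; i < s.length(); i++) {
            h = h.multiply(base).add(BigInteger.valueOf(s.charAt(i)));
        }
        return h;
    }

    public static void main(String[] args) {
        String s = "Welcome to Tutorialspoint.com";
        BigInteger exact = exactHash(s);

        System.out.println(exact);            // the huge exact value
        System.out.println(exact.intValue()); // only the low-order 32 bits
        System.out.println(s.hashCode());     // same as the line above
    }
}
```

The last two printed values are equal because int arithmetic that wraps on overflow is the same as working modulo 2^32.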


 