
Regarding HashMap implementation in Java

I was researching HashMap and came up with the following analysis:

https://stackoverflow.com/questions/11596549/how-does-javas-hashmap-work-internally/18492835#18492835

Q1: Can you show me a simple map and walk through the process in detail: how the hash code for a given key is calculated, and how the formula hash % (arrayLength-1) gives the position (bucket number) where the element should be placed? Let's say I have this HashMap:

HashMap<String, String> map = new HashMap<>(); // HashMap does not guarantee key order
map.put("Amit", "Java");
map.put("Saral", "J2EE");

Q2: Sometimes the hashCodes of two different objects are the same. In this case the two objects are saved in one bucket and stored as a linked list. The entry point is the more recently added object. This object refers to the other object through its next field, and so on. The last entry refers to null. Can you show me this with a real example?

"Amit" will be distributed to the 10th bucket, because of the bit twiddling. If there were no bit twiddling it would go to the 7th bucket, because 2044535 & 15 = 7. How is this possible? Please explain the whole calculation in detail.

Snapshots updated...

[image]

and the other image is...

[image]

"how hashcode for the given key is calculated in detail by using this formula"

In the case of String, this is calculated by String#hashCode(), which is implemented as follows:

public int hashCode() {
    int h = hash;       // cached value; 0 means "not computed yet"
    int len = count;
    if (h == 0 && len > 0) {
        int off = offset;
        char val[] = value;

        for (int i = 0; i < len; i++) {
            h = 31*h + val[off++];
        }
        hash = h;       // cache the result for later calls
    }
    return h;
}

This basically follows the equation in the Javadoc:

 hashcode = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

One interesting thing to note about this implementation is that String actually caches its hash code. It can do this because String is immutable.

If I calculate the hash code of the String "Amit", it yields this integer:

System.out.println("Amit".hashCode());
>     2044535
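To make the arithmetic behind that number visible, here is a small sketch that evaluates the documented polynomial by hand and with Horner's rule. The class name StringHashDemo and the helper polynomialHash are made up for this illustration; only String#hashCode() is the real API.

```java
public class StringHashDemo {
    // Hypothetical helper: evaluates s[0]*31^(n-1) + ... + s[n-1] with Horner's rule.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i); // int overflow wraps around, as in String#hashCode
        }
        return h;
    }

    public static void main(String[] args) {
        // 'A' = 65, 'm' = 109, 'i' = 105, 't' = 116
        int byHand = 65 * 31 * 31 * 31 + 109 * 31 * 31 + 105 * 31 + 116;
        System.out.println(byHand);                 // 2044535
        System.out.println(polynomialHash("Amit")); // 2044535
        System.out.println("Amit".hashCode());      // 2044535
    }
}
```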

Let's walk through a simple put into a map, but first we have to determine how the map is built. The most interesting fact about a Java HashMap is that it always has 2^n buckets. So if you call the default constructor, the number of buckets is 16, which is 2^4.

A put operation on this map first gets the hash code of the key. Some fancy bit twiddling is then applied to this hash code to ensure that poor hash functions (especially those that do not differ in the lower bits) don't "overload" a single bucket.

The function that is actually responsible for distributing your key to the buckets is the following:

 h & (length-1); // length is the current number of buckets, h the hashcode of the key

This only works for bucket counts that are a power of two, because it uses & instead of a modulo to map the key to a bucket.
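A short sketch of why the mask needs a power of two: for a length of 16 the mask gives the same result as a non-negative modulo, while for a length such as 10 it can only ever reach a few buckets. The class MaskVsModuloDemo and its helpers are hypothetical names used only for this illustration.

```java
import java.util.Set;
import java.util.TreeSet;

public class MaskVsModuloDemo {
    // Same expression as above: h & (length-1).
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Collects every bucket index that the hashes 0..99 can reach for a given length.
    static Set<Integer> reachableIndexes(int length) {
        Set<Integer> result = new TreeSet<>();
        for (int h = 0; h < 100; h++) {
            result.add(indexFor(h, length));
        }
        return result;
    }

    public static void main(String[] args) {
        // Power of two: the mask agrees with floorMod, even for negative hashes.
        System.out.println(indexFor(-7, 16) == Math.floorMod(-7, 16)); // true
        System.out.println(reachableIndexes(16).size());               // 16
        // Not a power of two: 10 - 1 = 9 = 0b1001, so only four buckets are ever used.
        System.out.println(reachableIndexes(10));                      // [0, 1, 8, 9]
    }
}
```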

"Amit" will be distributed to the 10th bucket, because of the bit twiddeling. “Amit”将被分配到第 10 个桶中,因为有点麻烦。 If there were no bit twiddeling it would go to the 7th bucket, because 2044535 & 15 = 7 .如果没有一点乱动,它将进入第 7 个桶,因为2044535 & 15 = 7

Now that we have an index, we can find the bucket. If the bucket contains elements, we have to iterate over them and replace an equal entry if we find one. If no matching item is found in the linked list, we just add the new entry at the beginning of the linked list.
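The put logic described here can be sketched with a tiny chained table. This is a simplified illustration, not the JDK code: TinyChainedMap and Node are hypothetical names, the table is fixed at 16 buckets, and the bit twiddling and resizing are left out. New nodes go to the head of the chain, as described above.

```java
import java.util.Objects;

public class TinyChainedMap<K, V> {
    // One link in a bucket's chain; the last node has next == null.
    static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    private int indexFor(Object key) {
        return Objects.hashCode(key) & (table.length - 1);
    }

    // Replaces the value of an equal key, otherwise adds a new node at the head.
    public V put(K key, V value) {
        int index = indexFor(key);
        for (Node<K, V> n = table[index]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        table[index] = new Node<>(key, value, table[index]);
        return null;
    }

    public V get(K key) {
        for (Node<K, V> n = table[indexFor(key)]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TinyChainedMap<String, String> map = new TinyChainedMap<>();
        map.put("Aa", "first");  // "Aa" and "BB" both have hash code 2112,
        map.put("BB", "second"); // so they end up in the same chain
        map.put("Aa", "replaced");
        System.out.println(map.get("Aa")); // replaced
        System.out.println(map.get("BB")); // second
    }
}
```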

The next important thing in HashMap is resizing: if the actual size of the map exceeds a threshold (determined by the current number of buckets and the load factor, in our case 16*0.75=12), it resizes the backing array. The new size is always 2 * the current number of buckets, which is guaranteed to be a power of two so that the function that finds the bucket keeps working.
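As a quick illustration of that arithmetic, the following sketch prints the threshold for the first few table sizes, assuming the default load factor of 0.75. ResizeThresholdDemo and threshold are names made up for this example.

```java
public class ResizeThresholdDemo {
    // Threshold = number of buckets * load factor.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = 16;
        for (int i = 0; i < 3; i++) {
            System.out.println(capacity + " buckets -> threshold " + threshold(capacity, 0.75f));
            capacity *= 2; // each resize doubles the number of buckets
        }
        // 16 buckets -> threshold 12
        // 32 buckets -> threshold 24
        // 64 buckets -> threshold 48
    }
}
```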

Since the number of buckets changes, we have to rehash all the current entries in the table. This is quite costly, so if you know how many items there will be, you should initialize the HashMap with a capacity sized for that count (taking the load factor into account) so that it does not have to resize repeatedly.
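One way to pre-size a map is sketched below. The helper capacityFor is hypothetical; it assumes the default load factor of 0.75 and simply picks an initial capacity whose threshold is at least the expected number of entries. The HashMap(int initialCapacity) constructor is the real API.

```java
import java.util.HashMap;
import java.util.Map;

public class PresizedMapDemo {
    // Hypothetical helper: smallest capacity c with c * 0.75 >= expectedEntries.
    static int capacityFor(int expectedEntries) {
        return (int) Math.ceil(expectedEntries / 0.75);
    }

    public static void main(String[] args) {
        int expected = 1000;
        System.out.println(capacityFor(expected)); // 1334

        // HashMap rounds the requested capacity up to the next power of two internally.
        Map<Integer, String> map = new HashMap<>(capacityFor(expected));
        for (int i = 0; i < expected; i++) {
            map.put(i, "value" + i);
        }
        System.out.println(map.size()); // 1000
    }
}
```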

Q1: Look at the hashCode() method implementation for the String object.

Q2: Create a simple class and implement its hashCode() method as return 1. That means every object of that class will have the same hashCode and will therefore be saved in the same bucket in the HashMap.
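A minimal sketch of that experiment follows. The class names CollisionDemo and Key are invented for this example. Both keys collide on purpose, yet the map still tells them apart, because equals() is used to search inside the shared bucket.

```java
import java.util.HashMap;
import java.util.Map;

public class CollisionDemo {
    // Every Key reports the same hash code, so all keys land in one bucket.
    static final class Key {
        final String name;

        Key(String name) {
            this.name = name;
        }

        @Override
        public int hashCode() {
            return 1;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).name.equals(name);
        }
    }

    static Map<Key, String> buildMap() {
        Map<Key, String> map = new HashMap<>();
        map.put(new Key("a"), "first");
        map.put(new Key("b"), "second");
        return map;
    }

    public static void main(String[] args) {
        Map<Key, String> map = buildMap();
        System.out.println(map.size());            // 2
        System.out.println(map.get(new Key("a"))); // first
        System.out.println(map.get(new Key("b"))); // second
    }
}
```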

Understand that there are two basic requirements for a hash code:

  1. When the hash code is recalculated for a given object (one that has not been changed internally in a way that would alter its identity), it must produce the same value as the previous calculation. Similarly, two "identical" objects must produce the same hash code.
  2. When the hash code is calculated for two different objects (ones that are not considered "identical" from the standpoint of their internal content), there should be a high probability that the two hash codes are different.

How these goals are accomplished is of great interest to the math nerds who work on such things, but understanding the details is not at all important to understanding how hash tables work.
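The two requirements can be illustrated with a small value class. This is only a sketch: HashContractDemo and Point are hypothetical names, and Objects.hash is just one convenient way to combine fields.

```java
import java.util.Objects;

public class HashContractDemo {
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        // Built from the same fields as equals, so equal points share a hash code.
        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        Point c = new Point(2, 1);
        System.out.println(a.hashCode() == a.hashCode()); // true: repeatable
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode()); // true: requirement 1
        System.out.println(a.hashCode() != c.hashCode()); // true here: requirement 2 is only a likelihood
    }
}
```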

import java.util.Arrays;
public class Test2 {
public static void main(String[] args) {
    Map<Integer, String> map = new Map<Integer, String>();
    map.put(1, "A");
    map.put(2, "B");
    map.put(3, "C");
    map.put(4, "D");
    map.put(5, "E");

    System.out.println("Iterate");
    for (int i = 0; i < map.size(); i++) {

        System.out.println(map.values()[i].getKey() + " : " + map.values()[i].getValue());
    }

    System.out.println("Get-> 3");
    System.out.println(map.get(3));

    System.out.println("Delete-> 3");
    map.delete(3);

    System.out.println("Iterate again");
    for (int i = 0; i < map.size(); i++) {

        System.out.println(map.values()[i].getKey() + " : " + map.values()[i].getValue());
    }
}

}

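// A simplified array-backed map: keys are found by linear search, not by hashing.
class Map<K, V> {

private int size;
@SuppressWarnings("unchecked")
private Entry<K, V>[] entries = new Entry[16];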

public void put(K key, V value) {

    boolean flag = true;
    for (int i = 0; i < size; i++) {

        if (entries[i].getKey().equals(key)) {
            entries[i].setValue(value);
            flag = false;
            break;
        }
    }

    if (flag) {
        this.ensureCapacity();
        entries[size++] = new Entry<K, V>(key, value);
    }
}

public V get(K key) {

    V value = null;

    for (int i = 0; i < size; i++) {

        if (entries[i].getKey().equals(key)) {
            value = entries[i].getValue();
            break;
        }
    }
    return value;
}

public boolean delete(K key) {
    boolean flag = false;
    Entry<K, V>[] entry = new Entry[size];
    int j = 0;
    int total = size;
    for (int i = 0; i < total; i++) {

        if (!entries[i].getKey().equals(key)) {
            entry[j++] = entries[i];
        } else {
            flag = true;
            size--;
        }
    }

    entries = flag ? entry : entries;
    return flag;
}

public int size() {
    return size;
}

public Entry<K, V>[] values() {
    return entries;
}

private void ensureCapacity() {

    if (size == entries.length) {
        entries = Arrays.copyOf(entries, size * 2);
    }
}

// Static nested class: an entry needs no reference to the enclosing Map instance.
public static class Entry<K, V> {

    private K key;
    private V value;

    public K getKey() {
        return key;
    }

    public V getValue() {
        return value;
    }

    public void setValue(V value) {
        this.value = value;
    }

    public Entry(K key, V value) {
        super();
        this.key = key;
        this.value = value;
    }

}
}
