
Understanding the Implementation of HashTable in Java

I am trying to understand the implementation of Hashtable in Java. Below is my code:

    Hashtable<Integer, String> hTab = new Hashtable<Integer, String>();
    hTab.put(1, "A");
    hTab.put(1, "B");
    hTab.put(2, "C");
    hTab.put(3, "D");

    Iterator<Map.Entry<Integer, String>> itr = hTab.entrySet().iterator();
    Entry<Integer, String> entry;

    while(itr.hasNext()){
        entry = itr.next();
        System.out.println(entry.getValue());
    }

When I run it, I get the below output: DCB

Which means that there has been a collision for the Key = 1; and as per the implementation:

"Whenever a collision happens in the hashTable, a new node is created in the linkedList corresponding for the particular bucket and the EntrySet(Key, Value) pairs are stored as nodes in the list, the new value is inserted in the beginning of the list for the particular bucket". And I completely agree with this implementation.

But if this is true, then where did "A" go when I try to retrieve the entry set from the Hashtable?

I then tried the code below to understand this, implementing my own hashCode and equals methods. Surprisingly, this works perfectly and behaves as per the Hashtable implementation. Below is my code:

public class Hash {

    private int key;

    public Hash(int key){
        this.key = key;
    }

    public int hashCode(){
        return key;
    }

    public boolean equals(Hash o){
        return this.key == o.key;

    }

}

public class HashTable1 {

    public static void main(String[] args) {
        // TODO Auto-generated method stub

        Hashtable<Hash, String> hTab = new Hashtable<Hash, String>();

        hTab.put(new Hash(1), "A");
        hTab.put(new Hash(1), "B");
        hTab.put(new Hash(2), "C");
        hTab.put(new Hash(3), "D");

        Iterator<Map.Entry<Hash, String>> itr = hTab.entrySet().iterator();
        Entry<Hash, String> entry;

        while(itr.hasNext()){
            entry = itr.next();
            System.out.println(entry.getValue());
        }
    }
}

Output : DCBA

Which is perfect. I am not able to understand this ambiguity in the behavior of HashTable in Java.

Update

@garrytan and @Brian: thanks for responding. But I still have a small doubt.

In my second piece of code, which works fine, I have created two objects which are new keys, and since they are 2 distinct objects, a key collision does not happen and it works fine. I agree with your explanation. However, if in the first piece of code I use "new Integer(1)" instead of simply "1", it still doesn't work, although I am now creating 2 objects and they should be different. I cross-checked by writing the simple lines below:

            Integer int1 = new Integer(1);
            Integer int2 = new Integer(1);
            System.out.println(int1 == int2);

which prints "false". That means the key collision should now have been resolved. But it still doesn't work. Why is this?

By design, a Hashtable is not meant to store duplicate keys.

I think you are mixing up 'hash collision' and 'key collision'. Put simply, a hash table consists of a collection of linked lists (i.e. buckets). When you add a new key-value pair (KVP), it is placed into a bucket chosen by the key's hash value. A 'hash collision' happens when two different keys produce the same hash (hence they are put into the same bucket). A 'key collision' happens when the two keys are themselves equal, in which case the new value replaces the old one.

A good hash function is one that distributes the keys evenly across the buckets, which improves key lookup performance.
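To make the difference concrete, here is a minimal sketch. The SameBucketKey class and the CollisionDemo wrapper are hypothetical names made up for this illustration; the key class deliberately returns a constant hash code so that every instance collides on the hash, while equals still tells the keys apart.

```java
import java.util.Hashtable;
import java.util.Map;

// Hypothetical key type: every instance has the same hash code,
// but equality depends on the name.
final class SameBucketKey {
    private final String name;

    SameBucketKey(String name) {
        this.name = name;
    }

    @Override
    public int hashCode() {
        return 42; // deliberately constant, to force hash collisions
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof SameBucketKey && ((SameBucketKey) o).name.equals(name);
    }
}

class CollisionDemo {
    // Hash collision: two unequal keys with the same hash code are both kept.
    static int sizeAfterHashCollision() {
        Map<SameBucketKey, String> table = new Hashtable<>();
        table.put(new SameBucketKey("x"), "A");
        table.put(new SameBucketKey("y"), "B");
        return table.size();
    }

    // Key collision: putting an equal key replaces the existing value.
    static String valueAfterKeyCollision() {
        Map<Integer, String> table = new Hashtable<>();
        table.put(1, "A");
        table.put(1, "B");
        return table.get(1);
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterHashCollision());  // prints 2
        System.out.println(valueAfterKeyCollision());  // prints B
    }
}
```

Only the second case loses "A", which is what happens in the first example of the question.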

The second example gives the behaviour you want because your implementation of equals is incorrect.

The signature is

public boolean equals(Object o) {}

not

public boolean equals(Hash h) {}

So what you have created is a hash collision, where two objects have the same hash code (key) but are not equal according to the equals method. Because your signature is wrong, Hashtable still calls the inherited Object.equals(Object), which compares references with ==, and never reaches your this.key == o.key code. This is the opposite of a key collision, where the objects have the same hashCode and are also equal, as in your first example. If you fix the second example to implement the actual equals(Object o) method, you will see that 'A' is again missing from the values.
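As a sketch of that fix, the class below is the Hash class from the question with equals(Object) properly overridden. The names FixedHash and FixedHashDemo are hypothetical and chosen only so the snippet does not clash with the original classes.

```java
import java.util.Hashtable;
import java.util.Map;

// The question's Hash class, renamed, with a real equals(Object) override.
class FixedHash {
    private final int key;

    FixedHash(int key) {
        this.key = key;
    }

    @Override
    public int hashCode() {
        return key;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof FixedHash)) return false;
        return key == ((FixedHash) o).key;
    }
}

class FixedHashDemo {
    // Same four puts as in the question's second example.
    static Map<FixedHash, String> build() {
        Map<FixedHash, String> table = new Hashtable<>();
        table.put(new FixedHash(1), "A");
        table.put(new FixedHash(1), "B"); // equal key: replaces "A"
        table.put(new FixedHash(2), "C");
        table.put(new FixedHash(3), "D");
        return table;
    }

    public static void main(String[] args) {
        Map<FixedHash, String> table = build();
        System.out.println(table.size());             // prints 3
        System.out.println(table.containsValue("A")); // prints false
        System.out.println(table.get(new FixedHash(1))); // prints B
    }
}
```

Annotating the method with @Override is a cheap safeguard here: with the equals(Hash) signature from the question, the compiler would reject the annotation and point at the mistake.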

In your second example you are not overriding the original equals function because you use the following signature:

public boolean equals(Hash h) {}

Thus the original equals method, which takes an Object parameter, is still the one that gets used. Since you create a new Hash object for each insert, each object is different from the others, and thus your keys for A and B are not equal.

Furthermore, a Hashtable is designed to hold ONE value for EACH key, and keys are compared using their equals method.

About your example with two new Integers, try comparing them with .equals() instead of ==. In your own key class you could also override hashCode so that each object gets a different hash code, e.g. depending on time, but that would not be good coding practice. Objects which are equal should hash to the same code.
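The effect of the overload can be seen without any Hashtable at all. In the sketch below, OverloadedHash is a hypothetical copy of the question's Hash class and OverloadDemo is an illustrative wrapper; the only thing that changes between the two calls is the static type of the variables.

```java
// Hypothetical copy of the question's Hash class: equals is overloaded, not overridden.
class OverloadedHash {
    private final int key;

    OverloadedHash(int key) {
        this.key = key;
    }

    @Override
    public int hashCode() {
        return key;
    }

    // Overload: a different parameter type, so Object.equals(Object) is left untouched.
    public boolean equals(OverloadedHash o) {
        return this.key == o.key;
    }
}

class OverloadDemo {
    // Static type is OverloadedHash, so the compiler picks equals(OverloadedHash).
    static boolean viaOverload() {
        OverloadedHash a = new OverloadedHash(1);
        OverloadedHash b = new OverloadedHash(1);
        return a.equals(b);
    }

    // Static type is Object, which is how a hash table sees its keys,
    // so Object.equals(Object) runs and compares references.
    static boolean viaObject() {
        Object a = new OverloadedHash(1);
        Object b = new OverloadedHash(1);
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(viaOverload()); // prints true
        System.out.println(viaObject());   // prints false
    }
}
```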
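For the Integer case in the update, a small sketch along these lines shows why the second put still replaces the first. IntegerKeyDemo is a hypothetical name, and new Integer(int) is used only because the question uses it; that constructor is deprecated in recent JDKs, hence the @SuppressWarnings.

```java
import java.util.Hashtable;
import java.util.Map;

class IntegerKeyDemo {
    // Returns { int1 == int2, int1.equals(int2), hash codes equal }.
    @SuppressWarnings({"deprecation", "removal"})
    static boolean[] compare() {
        Integer int1 = new Integer(1);
        Integer int2 = new Integer(1);
        return new boolean[] {
            int1 == int2,                       // reference comparison of two distinct objects
            int1.equals(int2),                  // value comparison
            int1.hashCode() == int2.hashCode()  // same value, same hash code
        };
    }

    // Two distinct but equal Integer keys still count as one key.
    @SuppressWarnings({"deprecation", "removal"})
    static Map<Integer, String> build() {
        Map<Integer, String> table = new Hashtable<>();
        table.put(new Integer(1), "A");
        table.put(new Integer(1), "B");
        return table;
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println(r[0]); // prints false
        System.out.println(r[1]); // prints true
        System.out.println(r[2]); // prints true
        System.out.println(build()); // prints {1=B}
    }
}
```

So == returning false only says the two references differ; the table compares keys with hashCode and equals, and by that measure they are the same key.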

