
What will happen in a HashMap if we put an element while rehashing is happening?

I want to know what will happen if we try to put an element in the map while resizing or rehashing is happening. Does it go to the new, larger map or the old one?

Also, what is the use of the extra free space in a HashMap, which is 25% of the map's capacity when the load factor is 75%?

Perhaps this needs a coherent answer.

I want to know what will happen if we try to put an element in the map while resizing or rehashing is happening.

This question only makes sense if you have two or more threads performing operations on the HashMap. If you are doing that, your code is not thread-safe. Its behavior is unspecified and version-specific, and bad things are likely to happen at unpredictable times: entries being lost, inexplicable NPEs, even one of your threads going into an infinite loop.

You should not write code where two or more threads operate on a HashMap without appropriate external synchronization to avoid simultaneous operations. If you do, I cannot tell you what will happen.

If you only have one thread using the HashMap , then the scenario you are concerned about is impossible. The resize happens during an update operation.

If you have multiple threads and synchronize to prevent any simultaneous operations, then the scenario you are concerned about is likewise impossible. The other alternative is to use ConcurrentHashMap, which is designed to work correctly when multiple threads read and write simultaneously. (Naturally, the code for resizing a ConcurrentHashMap is a lot more complicated, but it ensures that entries end up in the right place.)
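The thread-safe alternative can be sketched as follows. This is an illustrative example (the class and method names `ConcurrentPutDemo` / `fillConcurrently` are made up for the demo): several threads put disjoint key ranges into one ConcurrentHashMap, which resizes internally while the writes are in flight, yet no entry is lost.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentPutDemo {
    // Each thread writes a disjoint range of keys. ConcurrentHashMap
    // guarantees none are lost, even while it resizes internally.
    static Map<Integer, Integer> fillConcurrently(int threads, int perThread)
            throws InterruptedException {
        Map<Integer, Integer> map = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(base + i, base + i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) w.join();
        return map;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<Integer, Integer> map = fillConcurrently(8, 10_000);
        System.out.println(map.size()); // 80000 — no entries lost
    }
}
```

Running the same experiment with a plain HashMap would give unpredictable results: a smaller size, an exception, or a hang, depending on timing and JDK version.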

Does it go to the new, larger map or the old one?

Assuming you are talking about the multi-threaded non-synchronized case, the answer is unspecified and possibly version specific. (I haven't checked the code.) For the other cases, the scenario is impossible.


Also, what is the use of the extra free space in a HashMap, which is 25% of the map's capacity when the load factor is 75%?

It is not used. If the load factor is 75%, at least 25% of the hash slots will be empty / never used. (Until you hit the point where the hash array cannot be expanded any further for architectural reasons. But you will rarely reach that point.)

This is a performance trade-off. The Sun engineers judged that a load factor of 75% gives the best trade-off between memory used and the time taken to perform operations on a HashMap. As you increase the load factor, space utilization improves, but most operations on the HashMap get slower, because the average length of a hash chain increases.

You are free to use a different load factor value if you want. Just be aware of the potential consequences.
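Choosing a different load factor is done through the `HashMap(int initialCapacity, float loadFactor)` constructor. A minimal sketch (the class name `LoadFactorDemo` and helper `fill` are made up for illustration): the observable map behavior is identical for any load factor; only the memory/speed trade-off differs.

```java
import java.util.HashMap;
import java.util.Map;

public class LoadFactorDemo {
    // Build a map with a chosen initial capacity and load factor and
    // fill it with n entries. Correctness is unaffected by the load
    // factor; only resize frequency and chain lengths change.
    static Map<Integer, Integer> fill(int capacity, float loadFactor, int n) {
        Map<Integer, Integer> map = new HashMap<>(capacity, loadFactor);
        for (int i = 0; i < n; i++) map.put(i, i * i);
        return map;
    }

    public static void main(String[] args) {
        // Denser table: less memory per entry, longer chains on average.
        Map<Integer, Integer> dense = fill(16, 0.95f, 100);
        // Sparser table: more memory, fewer collisions.
        Map<Integer, Integer> sparse = fill(16, 0.10f, 100);
        System.out.println(dense.size() + " " + sparse.size()); // 100 100
    }
}
```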

Resizing and multithreading

If you access the hash map from a single thread, this cannot happen. Resizing is not triggered by a timer, but by an operation that changes the number of elements in the hash map, e.g. a put() call. If you call put() and the hash map sees that resizing is needed, it performs the resize first and then inserts your new element. This means the new element is added after resizing, no element is lost, and there is no inconsistent behaviour in any of the methods.
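This is easy to verify in a sketch (the class name `SingleThreadResizeDemo` is made up for the demo): inserting far more entries than the default capacity crosses several resize thresholds (12, 24, 48, ... with the default capacity 16 and load factor 0.75), and every entry, including the ones whose put() triggered a resize, is still present afterwards.

```java
import java.util.HashMap;
import java.util.Map;

public class SingleThreadResizeDemo {
    // Insert enough entries to cross several resize thresholds.
    // Each put() that triggers a resize still stores its own entry
    // afterwards, so nothing can be lost in a single-threaded run.
    static Map<Integer, String> fill(int n) {
        Map<Integer, String> map = new HashMap<>(); // capacity 16, LF 0.75
        for (int i = 0; i < n; i++) {
            map.put(i, "v" + i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Integer, String> map = fill(1000);
        System.out.println(map.size());  // 1000: nothing lost
        System.out.println(map.get(12)); // v12: this put() crossed the first threshold
    }
}
```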

But if you access your hash map from multiple threads, there can be many sorts of problems. For instance, if two threads call put() at the same time, both can trigger resizing, and one consequence can be that the new element of one of the threads is lost. Even when resizing is not needed, multithreading can lose elements. For instance, two threads compute the same bucket index for a bucket that does not exist yet. Both threads create the bucket and store it in the bucket array, but the most recent write wins and the other one is overwritten.

This is nothing specific to hash maps. It is the typical problem when multiple threads modify an object. To handle hash maps correctly in a multithreaded environment, you can either implement synchronization yourself or use a class that is already thread-safe, such as ConcurrentHashMap.
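The "implement synchronization yourself" option can be as simple as wrapping the map with Collections.synchronizedMap. A hedged sketch (the class name `SynchronizedMapDemo` and helper `fillSynchronized` are illustrative): every method call on the wrapper is synchronized, so concurrent put() calls cannot corrupt the underlying HashMap.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SynchronizedMapDemo {
    // Wrap a plain HashMap so every method call locks the wrapper.
    // Note: compound operations (check-then-put) still need an
    // explicit synchronized block around the whole sequence.
    static Map<Integer, Integer> fillSynchronized(int threads, int perThread)
            throws InterruptedException {
        Map<Integer, Integer> map =
                Collections.synchronizedMap(new HashMap<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(base + i, i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) w.join();
        return map;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(fillSynchronized(4, 1000).size()); // 4000
    }
}
```

ConcurrentHashMap is usually the better choice under heavy contention, since it does not serialize all access through a single lock.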

Load factor

Elements in a hash map are stored in buckets. If each hash corresponds to a single bucket index, the access time is O(1). The more hashes you have, the higher the probability that two hashes produce the same bucket index. Such elements are then stored in the same bucket, and the access time increases.
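Collisions can be made visible with a deliberately bad key. In this sketch (the classes `CollisionDemo` and `BadKey` are made up for the demo), every key returns the same hashCode, so all entries land in one bucket; the map stays correct, because lookups fall back to equals(), but each lookup must now search the whole bucket instead of running in O(1).

```java
import java.util.HashMap;
import java.util.Map;

public class CollisionDemo {
    // A key whose hashCode is deliberately constant: every instance
    // lands in the same bucket, so lookups must walk the bucket's
    // chain (or tree) and compare keys with equals().
    static final class BadKey {
        final String name;
        BadKey(String name) { this.name = name; }
        @Override public int hashCode() { return 42; } // all keys collide
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).name.equals(name);
        }
    }

    static Map<BadKey, Integer> fill(int n) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) map.put(new BadKey("k" + i), i);
        return map;
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = fill(100);
        // All 100 entries survive despite sharing one bucket;
        // lookups are just slower than O(1).
        System.out.println(map.size());                // 100
        System.out.println(map.get(new BadKey("k7"))); // 7
    }
}
```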

One way to reduce such collisions is to use a better hash function. But 1) designing a hash function that fits particular requirements can be a very non-trivial task (besides reducing collisions, it must provide acceptable performance), and 2) you can only improve the hashes of your own classes, not of the library classes you use.

Another, simpler solution is to use a larger number of buckets for the same number of hashes. When you reduce the ratio (number of hashes) / (number of buckets), you reduce the probability of collisions and thus keep the access time close to O(1). The price is that you need more memory. For instance, with a load factor of 75%, 25% of the bucket array is unused; with a load factor of 10%, 90% is unused.
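The ratio shows up concretely as the resize threshold: a HashMap grows once its size exceeds capacity × loadFactor. A small sketch of the arithmetic (the class name `ThresholdDemo` is made up for illustration):

```java
public class ThresholdDemo {
    // A HashMap resizes (doubling its bucket array) once its size
    // exceeds capacity * loadFactor, keeping the hashes/buckets
    // ratio below the chosen load factor.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f)); // 12: default HashMap
        System.out.println(threshold(32, 0.75f)); // 24: after one resize
        System.out.println(threshold(16, 0.10f)); // 1: very sparse table
    }
}
```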

There is no solution that fits all cases. Try different values and measure performance and memory usage, and then decide what is better in your case.
