
Queries regarding the implementation details of java.util.Hashtable

I have the following questions about how java.util.Hashtable is implemented. These are low-level questions, not about how to use Hashtable but about how the designers chose to implement the data structure.

  • The Hashtable is created with a default size of 11 buckets. What is special about 11? Why not 10? I am inclined to think this is just a magic number, but I suspect there is more to it.
  • To compute the bucket number, why do we not use the hashcode of the passed-in key object directly? The implementation computes the bucket number as (hashcode & 0x7FFFFFFF) % table size, where hashcode is the value returned for the input key and table size is 11 by default. Why do we transform the hashcode at all? Couldn't it have been just hashcode % table size?
  • The contains(Object value) method searches for the value in the hashtable by scanning sequentially from the last bucket towards the first. Is this just the developer's style? The hashtable is just an array of linked lists, so I expected the search to run from the first bucket to the last, but found otherwise. I understand that both are functionally the same. Is there any other reason?
  • The maximum array size (used during the rehash) is set to Integer.MAX_VALUE - 8. What's the significance of 8 here?
  1. It appears to be an empirical value reflecting a tradeoff between wasted space and time-consuming rehash operations. From the Hashtable Javadoc:

The initial capacity controls a tradeoff between wasted space and the need for rehash operations, which are time-consuming. No rehash operations will ever occur if the initial capacity is greater than the maximum number of entries the Hashtable will contain divided by its load factor. However, setting the initial capacity too high can waste space.

  2. The value is bitmasked with 0x7FFFFFFF to clear the sign bit, which is the bit that would make the value negative. This forces the value to be non-negative, so the index produced by the % operation is also non-negative. This is necessary to produce a valid index into the internal bucket array.
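The effect of the mask can be checked with a small sketch. The class and method names below (BucketIndexDemo, naiveIndex, maskedIndex) are made up for illustration and are not part of java.util.Hashtable; only the expression (hash & 0x7FFFFFFF) % tableSize comes from the question.

```java
// Illustrative sketch only; names are hypothetical, not Hashtable's own.
final class BucketIndexDemo {

    // Plain remainder: in Java the result takes the sign of the dividend,
    // so a negative hash gives a negative (invalid) index.
    static int naiveIndex(int hash, int tableSize) {
        return hash % tableSize;
    }

    // Clear the sign bit first, so the dividend is always non-negative
    // and the result lies in [0, tableSize).
    static int maskedIndex(int hash, int tableSize) {
        return (hash & 0x7FFFFFFF) % tableSize;
    }

    public static void main(String[] args) {
        int tableSize = 11; // Hashtable's default capacity
        int[] hashes = { 42, -7, Integer.MIN_VALUE };
        for (int h : hashes) {
            System.out.println(h + " -> naive " + naiveIndex(h, tableSize)
                    + ", masked " + maskedIndex(h, tableSize));
        }
    }
}
```

With a table size of 11, a hash of -7 gives -7 from the plain remainder, which would throw ArrayIndexOutOfBoundsException if used as an array index, while the masked version gives 6.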

  3. It's possible that this was done to increase performance slightly. This article claims that looping backwards does exactly that.

The result show there's not much different between forward and reverse looping in 1 million of data. However when data grow huge, the performance of reverse looping is slightly faster than forward looping around 15%.

I don't know if that's really true, but that may have been the motivation.

  4. The source code I have includes a Javadoc comment on the private constant used for the maximum array size.
/**
 * The maximum size of array to allocate.
 * Some VMs reserve some header words in an array.
 * Attempts to allocate larger arrays may result in
 * OutOfMemoryError: Requested array size exceeds VM limit
 */
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

I don't know how valid this still is, but it was an attempt to avoid unexpected OutOfMemoryErrors.

#1: AndreyS answered this in the comments by pointing to the question "Why initialCapacity of Hashtable is 11 while the DEFAULT_INITIAL_CAPACITY in HashMap is 16 and requires a power of 2".

#2: To make sure the number is non-negative before computing the modulus. Otherwise the result may be negative and the index would be out of bounds.

#3: My guess is that a reverse loop evaluates the length only once and then compares against a constant (0), whereas a regular loop compares against a variable on every iteration. I don't know if that's what they had in mind, but it may have been a consideration.
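For reference, the reverse-scan idiom being discussed looks roughly like the sketch below. Node and containsValue are simplified, hypothetical stand-ins for Hashtable's private entry type and its contains method; this is not a copy of the JDK source, and it makes no claim about whether the reverse order is actually faster.

```java
// Illustrative sketch only; Node and containsValue are hypothetical names.
final class ReverseScanDemo {

    // Minimal singly linked bucket entry (keys omitted for brevity).
    static final class Node {
        final Object value;
        final Node next;

        Node(Object value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    // Scans buckets from the last index down to 0. The array length is read
    // once, and each iteration compares the counter against the constant 0.
    static boolean containsValue(Node[] buckets, Object value) {
        for (int i = buckets.length; i-- > 0; ) {
            for (Node e = buckets[i]; e != null; e = e.next) {
                if (e.value.equals(value)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Node[] buckets = new Node[3];
        buckets[0] = new Node("a", new Node("b", null));
        buckets[2] = new Node("c", null);
        System.out.println(containsValue(buckets, "b"));
        System.out.println(containsValue(buckets, "z"));
    }
}
```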

#4: To avoid integer overflow in rehash():

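// Decompiled fragment; 2147483639 is Integer.MAX_VALUE - 8 (MAX_ARRAY_SIZE).
int i = table.length;              // old capacity
Entry[] arrayOfEntry1 = table;     // old bucket array
int j = (i << 1) + 1;              // new capacity: 2 * old + 1, may overflow
// Overflow-conscious check: also true when j has wrapped to a negative value.
if (j - 2147483639 > 0)
{
  if (i == 2147483639) {
    return;                        // already at the maximum; keep the current table
  }
  j = 2147483639;                  // clamp to the maximum array size
}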
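Why the check is written as a subtraction rather than a direct comparison can be seen in a standalone sketch. GrowthDemo, nextCapacity and naiveNextCapacity are hypothetical names introduced here; only the 2n + 1 growth step and the MAX_ARRAY_SIZE constant come from the code quoted above. Unlike the fragment above, the sketch simply returns the clamped value instead of returning early when the table is already at the maximum.

```java
// Illustrative sketch only; method names are hypothetical.
final class GrowthDemo {

    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8; // 2147483639

    // Overflow-conscious: if (old << 1) + 1 wraps to a negative int, the
    // subtraction wraps again and the difference is still positive, so the
    // clamp is applied.
    static int nextCapacity(int oldCapacity) {
        int newCapacity = (oldCapacity << 1) + 1;
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            newCapacity = MAX_ARRAY_SIZE;
        }
        return newCapacity;
    }

    // Direct comparison: a wrapped (negative) value is not greater than
    // MAX_ARRAY_SIZE, so the overflow goes unnoticed.
    static int naiveNextCapacity(int oldCapacity) {
        int newCapacity = (oldCapacity << 1) + 1;
        if (newCapacity > MAX_ARRAY_SIZE) {
            newCapacity = MAX_ARRAY_SIZE;
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        System.out.println(nextCapacity(11));
        System.out.println(nextCapacity(1_500_000_000));
        System.out.println(naiveNextCapacity(1_500_000_000));
    }
}
```

Starting from the default capacity of 11, this growth step yields 23, then 47. For an old capacity of 1,500,000,000 the doubled value no longer fits in an int; the subtraction form still clamps to MAX_ARRAY_SIZE, while the direct comparison returns a negative capacity.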
