
Weakly consistent iterator returned by ConcurrentHashMap

Java Concurrency in Practice mentions that:

The iterators returned by ConcurrentHashMap are weakly consistent rather than fail-fast. A weakly consistent iterator can tolerate concurrent modification, traverses elements as they existed when the iterator was constructed, and may (but is not guaranteed to) reflect modifications to the collection after the construction of the iterator.

  1. How does making the iterator weakly consistent or fail-safe help in a concurrent environment, given that the state of the ConcurrentHashMap can still be modified? The only difference is that it will not throw a ConcurrentModificationException.
  2. Why do the other Collections return fail-fast iterators, when a fail-safe iterator would be better for concurrency?
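
To make the difference concrete, here is a minimal, self-contained sketch (assuming Java 8+; the class name, keys and values are just for illustration) that modifies each map from the same thread while iterating it:

    import java.util.ConcurrentModificationException;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class IteratorDemo {
        public static void main(String[] args) {
            // Fail-fast: a structural modification during iteration throws CME.
            Map<String, Integer> hashMap = new HashMap<>();
            hashMap.put("a", 1);
            hashMap.put("b", 2);
            try {
                for (String key : hashMap.keySet()) {
                    hashMap.put("c", 3);   // structural modification of the map being iterated
                }
            } catch (ConcurrentModificationException e) {
                System.out.println("HashMap iterator failed fast");
            }

            // Weakly consistent: the same modification is tolerated silently.
            Map<String, Integer> chm = new ConcurrentHashMap<>();
            chm.put("a", 1);
            chm.put("b", 2);
            for (String key : chm.keySet()) {
                chm.put("c", 3);           // no exception; "c" may or may not be visited
            }
            System.out.println("ConcurrentHashMap iterated without an exception: " + chm);
        }
    }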

Correctness in your particular case

Please keep in mind that a Fail Fast iterator iterates over the original collection.

In contrast, a Fail Safe (aka weakly consistent) iterator iterates over a copy of the original collection. Therefore any changes to the original collection go unnoticed, and that is how it guarantees the absence of ConcurrentModificationExceptions.
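
The copy-based behavior is easiest to see with CopyOnWriteArrayList, whose iterator is pinned to the array that existed when the iterator was created; a minimal sketch (the values are just for illustration):

    import java.util.Iterator;
    import java.util.concurrent.CopyOnWriteArrayList;

    public class SnapshotDemo {
        public static void main(String[] args) {
            CopyOnWriteArrayList<String> list = new CopyOnWriteArrayList<>();
            list.add("a");
            list.add("b");

            Iterator<String> it = list.iterator();  // bound to the array as it is right now
            list.add("c");                          // change made after the iterator was created
            list.clear();                           // even clearing does not affect the iterator

            while (it.hasNext()) {
                System.out.println(it.next());      // prints "a" and "b"; never "c"; no exception
            }
            System.out.println("List is now: " + list);  // []
        }
    }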


To answer your questions:

  1. Using a Fail Safe iterator helps concurrency because you don't have to block reading threads on the whole collection. The collection can be modified underneath while reading happens. The drawback is that a reading thread will see the state of the collection as a snapshot taken at the time the iterator was created.
  2. If the above limitation is not acceptable for your particular use case (your readers should always see the current state of the collection), you have to use a Fail Fast iterator and keep concurrent access to the collection under tighter control.

As you can see, it is a trade-off between the correctness of your use case and speed.

ConcurrentHashMap

ConcurrentHashMap (CHM) uses several tricks to increase the concurrency of access.

  • Firstly, CHM is actually a grouping of multiple maps: each MapEntry is stored in one of a number of segments, each of which is itself a hashtable that can be read concurrently (read methods do not block).
  • The number of segments is the last argument of the three-argument constructor and is called concurrencyLevel (default 16); see the construction sketch after this list. The number of segments determines how many writers can work concurrently across the whole of the data. An additional internal hashing algorithm ensures an even spread of entries between the segments.
  • Each HashEntry 's value is volatile , thereby ensuring fine-grained consistency for contended modifications and subsequent reads; each read reflects the most recently completed update.
  • Iterators and Enumerations are Fail Safe, reflecting the state at some point since the creation of the iterator/enumeration; this allows simultaneous reads and modifications at the cost of reduced consistency.
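
For illustration, a minimal sketch of the three-argument constructor mentioned above (note that the segment-based design described in this answer is the pre-Java-8 implementation; since Java 8 the concurrencyLevel argument is only used as a sizing hint):

    import java.util.concurrent.ConcurrentHashMap;

    public class ChmConstructionDemo {
        public static void main(String[] args) {
            // initialCapacity = 64, loadFactor = 0.75f, concurrencyLevel = 32
            // concurrencyLevel is the estimated number of concurrently updating threads.
            ConcurrentHashMap<String, Integer> map =
                    new ConcurrentHashMap<>(64, 0.75f, 32);

            map.put("hits", 1);
            map.merge("hits", 1, Integer::sum);   // atomic read-modify-write, no external locking
            System.out.println(map.get("hits"));  // 2
        }
    }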

TL;DR: Because locking.

If you want a consistent iterator, then you have to lock all modifications to the Map - this is a massive penalty in a concurrent environment.

You can of course do this manually if that is what you want, but iterating over a Map is not its primary purpose, so the default behaviour allows for concurrent writes while iterating.
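
A minimal sketch of doing it manually, using the documented Collections.synchronizedMap pattern (the class and method names here are made up for illustration):

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;

    public class ConsistentIterationDemo {
        private static final Map<String, Integer> map =
                Collections.synchronizedMap(new HashMap<>());

        // Writers simply go through the synchronized wrapper.
        static void write(String key, int value) {
            map.put(key, value);
        }

        // Readers must hold the map's own lock for the whole iteration;
        // otherwise the fail-fast iterator may throw ConcurrentModificationException.
        static void readAll() {
            synchronized (map) {   // blocks all writers for the duration of the loop
                for (Map.Entry<String, Integer> e : map.entrySet()) {
                    System.out.println(e.getKey() + " = " + e.getValue());
                }
            }
        }
    }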

The same argument does not apply to normal collections, which are only (allowed to be) accessed by a single thread. Iteration over an ArrayList is expected to be consistent, so the fail-fast iterators enforce consistency.

First of all, the iterators of concurrent collections are not fail-safe because they do not have failure modes which they could somehow handle with some kind of emergency procedure. They simply do not fail.

The iterators of the non-concurrent collections are fail-fast because, for performance reasons, they are designed in a way that does not allow the internal structure of the collection they iterate over to be modified. E.g. a HashMap's iterator would not know how to continue iterating after the reshuffling that happens when a HashMap gets resized.

That means they would not just fail because other threads access them; they would also fail if the current thread performs a modification that invalidates the assumptions of the iterator.

Instead of ignoring those troublesome modifications and returning unpredictable and corrupted results, those collections track modifications and throw an exception during iteration to inform the programmer that something is wrong. This is called fail-fast.

Those fail-fast mechanisms are not thread-safe either. That means if the illegal modifications happen not from the current thread but from a different thread, they are no longer guaranteed to be detected. In that case the mechanism can only be thought of as a best-effort failure detection.

On the other hand, concurrent collections must be designed in a manner that can deal with multiple writes and reads happening at the same time, with the underlying structure changing constantly.

So their iterators cannot assume that the underlying structure is never modified during iteration.

Instead they are designed to provide weaker guarantees, such as iterating over outdated data, or perhaps showing some but not all of the updates made after the creation of the iterator. This also means that they might return outdated data when they are modified during iteration within a single thread, which can be counter-intuitive for a programmer, since one usually expects modifications within a single thread to be visible immediately.

Examples:

HashMap : best-effort fail-fast iterator.

  • iterator supports removal
  • structural modification from the same thread, such as clear() ing the Map during iteration: guaranteed to throw a ConcurrentModificationException on the next iterator step
  • structural modification from a different thread during iteration: the iterator usually throws an exception, but may also cause inconsistent, unpredictable behavior

CopyOnWriteArrayList : snapshot iterator

  • iterator does not support removal
  • iterator shows a view on the items frozen at the time it was created
  • collection can be modified by any thread including the current one during iteration without causing an exception, but it has no effect on the items visited by the iterator
  • clear() ing the list will not stop iteration
  • iterator never throws CME

ConcurrentSkipListMap : weakly consistent iterator

  • iterator supports removal, but it may cause surprising behavior since removal is based solely on the Map key, not on the current value
  • iterator may see updates that happened since its creation, but is not guaranteed to; that means, for example, that clear() ing the Map may or may not stop iteration, and removing entries may or may not stop them from showing up during the remaining iteration
  • iterator never throws CME
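
A small sketch of the weakly consistent behavior described above; the keys and values are arbitrary, and the exact output is deliberately not guaranteed:

    import java.util.Map;
    import java.util.concurrent.ConcurrentSkipListMap;

    public class WeaklyConsistentDemo {
        public static void main(String[] args) {
            ConcurrentSkipListMap<Integer, String> map = new ConcurrentSkipListMap<>();
            for (int i = 1; i <= 5; i++) {
                map.put(i, "v" + i);
            }

            for (Map.Entry<Integer, String> e : map.entrySet()) {
                if (e.getKey() == 2) {
                    map.remove(4);      // remove a not-yet-visited entry during iteration
                    map.put(6, "v6");   // insert a new entry during iteration
                }
                // No ConcurrentModificationException is thrown; whether key 4 is skipped
                // and whether key 6 shows up is not guaranteed (weakly consistent).
                System.out.println(e.getKey() + " = " + e.getValue());
            }
        }
    }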
