Monday, June 2, 2014

Java ConcurrentHashMap Overview

Background:

ConcurrentHashMap was introduced in Java 1.5 as a scalable alternative to Hashtable. Prior to Java 1.5, if you needed a thread-safe map, you had to use either Hashtable or a wrapper created with Collections.synchronizedMap(), because HashMap is not thread-safe.

ConcurrentHashMap scales better under multi-threaded load than Hashtable and Collections.synchronizedMap(). Those two guard every operation, reads included, with a single lock on the whole map. ConcurrentHashMap locks only a portion of the map for a write, and its reads generally do not block at all.

How It Works:

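In the implementation that shipped with Java 5 through Java 7, ConcurrentHashMap divides the map into multiple segments, where each segment has its own lock. The number of segments is determined by the concurrency level, which is 16 by default and can be set through the constructor. A write locks only the segment that owns the key, so up to 16 threads can update different segments simultaneously without blocking each other. (Java 8 replaced segments with finer-grained locking on individual hash bins and treats the concurrency level only as a sizing hint, but the principle of locking a small part of the map is the same.)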
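As a rough illustration of threads updating one map at the same time, the sketch below starts 16 writer threads that each insert a disjoint range of keys into a shared ConcurrentHashMap. The class name, thread count, and key counts are made up for this example; it shows only that no insert is lost without external synchronization, not how the locking is implemented internally.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ConcurrentWritersDemo {
    // Hypothetical sizes: 16 writers simply mirrors the default concurrency level.
    static final int THREADS = 16;
    static final int KEYS_PER_THREAD = 1000;

    /** Each worker inserts its own disjoint range of keys; returns the final map size. */
    static int fillConcurrently() {
        Map<Integer, Integer> map = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        for (int t = 0; t < THREADS; t++) {
            final int base = t * KEYS_PER_THREAD;
            pool.execute(() -> {
                for (int i = 0; i < KEYS_PER_THREAD; i++) {
                    map.put(base + i, i); // no synchronized block around the map
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently()); // prints 16000: every insert survived
    }
}
```

Running the same loop against a plain HashMap is not safe: inserts can be lost or the map can be corrupted, which is the problem ConcurrentHashMap is meant to solve.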

Reads such as get() do not block and may overlap with updates such as put() and remove(); a read reflects the results of the most recently completed update. Bulk operations such as putAll() and clear() are not atomic, so a concurrent reader may observe only some of the entries being inserted or removed. Additionally, the keySet iterator (like the other view iterators) is weakly consistent: it reflects the state of the map at some point at or since the creation of the iterator, so recent changes may or may not be visible.

When to Use ConcurrentHashMap:

ConcurrentHashMap is best suited for situations where reads are much more frequent than writes, because reads do not block. Under write-heavy load, where many threads contend for the same portions of the map, the advantage narrows and its performance can approach that of synchronizedMap or Hashtable.

Because a ConcurrentHashMap can be populated while other threads are reading it, it is well suited to cache-like applications where many threads need to read from the map concurrently. The Javadoc for Hashtable itself recommends ConcurrentHashMap in place of Hashtable when a thread-safe, highly concurrent implementation is needed.
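A cache of this kind could look roughly like the sketch below. SquareCache, compute(), and computationCount() are hypothetical names invented for the example, and squaring a number stands in for whatever expensive work a real cache would avoid. The sketch assumes that occasionally computing the same value twice is acceptable: two threads that miss at the same moment may both call compute(), but putIfAbsent() guarantees that only one result is stored and that both callers return it.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

class SquareCache {
    private final ConcurrentMap<Integer, Long> cache = new ConcurrentHashMap<>();
    private final AtomicInteger computations = new AtomicInteger();

    /** Stand-in for an expensive computation (hypothetical). */
    private long compute(int n) {
        computations.incrementAndGet();
        return (long) n * n;
    }

    /** Returns the cached value for n, computing and storing it on a miss. */
    long get(int n) {
        Long cached = cache.get(n); // common path: a non-blocking read
        if (cached != null) {
            return cached;
        }
        long fresh = compute(n);
        // If another thread stored a value first, keep theirs and discard ours.
        Long raced = cache.putIfAbsent(n, fresh);
        return raced != null ? raced : fresh;
    }

    /** Number of times compute() actually ran. */
    int computationCount() {
        return computations.get();
    }

    public static void main(String[] args) {
        SquareCache squares = new SquareCache();
        System.out.println(squares.get(7));             // 49, computed
        System.out.println(squares.get(7));             // 49, served from the cache
        System.out.println(squares.computationCount()); // 1
    }
}
```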

Differences Between ConcurrentHashMap and Hashtable:

Both ConcurrentHashMap and Hashtable are thread-safe, but there are important differences:

  1. Locking Behavior: Hashtable guards every method with a single lock on the whole table, and iterating safely over a large Hashtable means holding that lock for the entire traversal, which can significantly degrade performance. In contrast, ConcurrentHashMap locks only the individual segment being written to, allowing more concurrency.

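  2. Iteration: ConcurrentHashMap iterators are weakly consistent (often loosely called fail-safe), meaning that the iterator does not throw ConcurrentModificationException even if the map is modified during iteration. However, the iterator may not reflect updates made after it was created. The iterators over Hashtable's collection views are fail-fast and throw the exception instead.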

  3. Performance: ConcurrentHashMap offers better performance in high-concurrency scenarios by allowing multiple threads to operate on different segments of the map concurrently.
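The iteration difference in the list above can be observed even on a single thread. The sketch below (IterationDemo and throwsOnModification are names invented for this example) adds a new key while looping over keySet() and reports whether ConcurrentModificationException was thrown. HashMap is included only as a familiar fail-fast baseline.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class IterationDemo {
    /**
     * Fills the given map with three entries, then adds a fourth key while
     * iterating over keySet(). Returns true if the iterator threw
     * ConcurrentModificationException, false if iteration completed.
     */
    static boolean throwsOnModification(Map<String, Integer> map) {
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                map.put("z", 26); // structural change on the first pass
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsOnModification(new HashMap<>()));           // true
        System.out.println(throwsOnModification(new Hashtable<>()));         // true
        System.out.println(throwsOnModification(new ConcurrentHashMap<>())); // false
    }
}
```

Whether the ConcurrentHashMap loop ever visits "z" is deliberately left unspecified: a weakly consistent iterator may or may not see entries added after it was created.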

Handling Conditional Operations:

In a typical HashMap, you might use code like this to check if a key exists and then insert a value:

if (map.get(key) == null) {
    return map.put(key, value);
} else {
    return map.get(key);
}

With a ConcurrentHashMap shared between threads, this approach might not behave as expected. Each call is thread-safe on its own, but the map is not locked between the get() and the put(), so the check-then-act sequence as a whole is not atomic. Two threads can both see null for the same key and both call put(), and the second write silently overwrites the first.

To avoid such issues, ConcurrentHashMap provides a putIfAbsent() method, which atomically checks if a key is absent and then inserts the value. This eliminates race conditions and ensures the desired behavior without needing to synchronize the entire block:

map.putIfAbsent(key, value);

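This method ensures that the key is only added if it is not already present, providing thread-safe conditional insertion. It returns the value previously associated with the key, or null if there was none, so the caller can tell whether its own value was the one stored.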
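To make the guarantee concrete, here is a small sketch in which several threads race to register a value under the same key. PutIfAbsentDemo, countWinners, and the key "config" are hypothetical names chosen for the example. Each thread treats a null return from putIfAbsent() as "my value was stored", and the method counts how many threads saw that.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PutIfAbsentDemo {
    /** Lets `threads` workers race on one key; returns how many saw their value stored. */
    static int countWinners(int threads) {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();
        AtomicInteger winners = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.execute(() -> {
                try {
                    start.await(); // line all workers up before the race
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                // null means the key was absent and this thread's value went in.
                if (map.putIfAbsent("config", id) == null) {
                    winners.incrementAndGet();
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return winners.get();
    }

    public static void main(String[] args) {
        System.out.println(countWinners(8)); // prints 1: exactly one insert succeeded
    }
}
```

With the get()-then-put() version, more than one thread could believe it had inserted the value; with putIfAbsent() exactly one thread receives null, however the threads interleave.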

Conclusion:

  • ConcurrentHashMap is an excellent choice for concurrent access in multi-threaded applications, particularly when reads far outweigh writes.
  • Performance is improved because writes lock only a portion of the map and reads do not block, allowing multiple threads to operate concurrently.
  • It is recommended as a replacement for Hashtable in most scenarios, providing better scalability and thread safety in concurrent environments.
  • For conditional (check-then-act) operations, atomic methods such as putIfAbsent() provide a thread-safe, efficient alternative to synchronizing a whole block around a synchronizedMap or Hashtable.