Goals
High Performance:
- Performance scales linearly as nodes are added, up to around 1,000 nodes.
- No proxies: clients talk directly to the nodes, and replication is asynchronous.
- No merge operations are performed on values.
- Writes are best-effort: usually durable, but an acknowledged write can still be lost in some failure scenarios.
High Availability:
- Service continues as long as a majority of masters are reachable.
- Each unreachable master must have at least one reachable slave that can be promoted.
Feature Subset:
- Supports all single-key operations from standard Redis.
- Multi-key operations are supported only when keys reside on the same node (e.g., using hashtags to group keys).
Key Features
Automatic Data Distribution:
- Redis Cluster distributes data across multiple nodes automatically.
- Service can continue even when some nodes go down.
Two Ports Required:
- Client Port (e.g., 6379), used for normal client commands.
- Cluster Bus Port (client port + 10000), used for:
  - Node-to-node communication (binary protocol).
  - Failure detection, configuration updates, and failover authorization.
Data Sharding:
- Uses hash slots instead of consistent hashing.
- Total of 16384 hash slots.
- Slot calculated as CRC16(key) % 16384.
- Each node manages a subset of these slots.
- Allows rebalancing (e.g., node addition/removal) without system downtime.
- Hash Tags can force multiple keys into the same hash slot for multi-key operations.
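The slot calculation is small enough to sketch directly. Redis Cluster uses the CRC-16/XMODEM variant (polynomial 0x1021), and the hash-tag rule hashes only the substring between the first `{` and the first following `}` when that substring is non-empty. A minimal sketch:

```python
def crc16(data: bytes) -> int:
    """CRC-16/XMODEM (poly 0x1021), the variant Redis Cluster uses."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else crc << 1
            crc &= 0xFFFF
    return crc

def key_hash_slot(key: str) -> int:
    """Map a key to one of the 16384 slots, honoring {hash tags}."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:            # only a non-empty tag counts
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384

# Reference check value for CRC-16/XMODEM from the cluster spec:
assert crc16(b"123456789") == 0x31C3

# Hash tags force related keys into the same slot:
print(key_hash_slot("{user:1000}.following") ==
      key_hash_slot("{user:1000}.followers"))   # True
```

Because both keys hash only the tag `user:1000`, they land in the same slot and can be used together in multi-key operations.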
Failover:
- Each master node should have at least one slave.
- If a master fails, its slave is promoted to master.
- Cluster fails if both the master and its slave are unavailable.
Consistency
Eventual Consistency:
- Replication is asynchronous: the master acknowledges the client before its slaves confirm the write.
- Data can be lost if a master fails before propagation to its slaves.
- Strong consistency is not guaranteed.
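The loss window can be illustrated with a toy model (plain Python, not real Redis): the master acknowledges writes before shipping them to its replica, so a crash between the acknowledgment and the replication step silently drops the write.

```python
# Toy model of asynchronous replication losing an acknowledged write.
class Node:
    def __init__(self):
        self.data = {}

master, replica = Node(), Node()
pending = []  # replication backlog not yet shipped to the replica

def write(key, value):
    master.data[key] = value       # master applies the write...
    pending.append((key, value))   # ...and queues it for the replica
    return "OK"                    # client is acknowledged immediately

def replicate_one():
    key, value = pending.pop(0)
    replica.data[key] = value

write("a", 1)
replicate_one()       # "a" reaches the replica
write("b", 2)         # acknowledged, but never replicated
# master crashes; the replica is promoted and becomes the new master
master = replica
print(master.data)    # {'a': 1} -- the acknowledged write of "b" is gone
```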
WAIT Command:
- Blocks the client until the write has been propagated to a given number of slaves.
- Even then, strong consistency is not guaranteed: a failover can still promote a slave that missed the write.
Partition Example:
- Suppose a cluster with 3 masters (A, B, C) and their respective slaves (A1, B1, C1). If a network partition isolates:
  - Master B and a client Z1 on one side.
  - Masters A and C and slaves A1, B1, C1 on the other.
- Client Z1 keeps writing to B, but if the partition persists long enough for B1 to be promoted to master on the majority side, the writes accepted by B will be lost once the partition heals.
Trade-offs
- Redis Cluster does not fit the strict definitions of the CAP theorem:
- Not strictly CP: data can be lost during network partitions.
- Not strictly AP: the minority side of a partition stops serving requests.
Handling Failures
Node Timeout:
- A master is considered failing if unresponsive for longer than the node timeout configuration.
- One of its slaves is then promoted to replace it as the new master.
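The settings involved can be sketched as a minimal per-node configuration (directive names are the standard redis.conf ones; the port value is an example):

```conf
port 7000
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000    # milliseconds a node may be unresponsive
appendonly yes
```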
Merge Strategy:
- Last Failover Wins: the data set of the most recent failover determines the surviving values.
- Write operations during network partitions can be lost.
Design Considerations
- No Value Merge Overhead:
- Avoids maintaining metadata for value merging, saving memory overhead.
- Unlimited Value Size:
- Large values are supported but at the cost of reduced stability during network partitions.
- Real-World Pragmatism:
- Redis Cluster prioritizes practical business needs over strict adherence to theoretical distributed system limitations.
This approach strikes a balance between high performance and reasonable availability, making Redis Cluster suitable for many real-world applications.