Tuesday, June 9, 2015

Redis Cluster Summary

Goals

  1. High Performance:

    • Performance scales linearly up to around 1,000 nodes.
    • Asynchronous replication and no proxy layer: clients talk directly to the node that holds a given key.
    • No value-based merge operations.
    • Writes are acknowledged on a best-effort basis: they usually survive, but durability is not guaranteed.
  2. High Availability:

    • Service continues as long as a majority of masters are reachable.
    • Every unreachable master must have at least one reachable slave that can be promoted.
  3. Feature Subset:

    • Supports all single-key operations from standard Redis.
    • Multi-key operations are supported only when keys reside on the same node (e.g., using hashtags to group keys).
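
The hash-tag grouping mentioned above can be made concrete with a toy sketch of how Redis Cluster maps keys to slots: CRC16-CCITT (XModem variant) modulo 16384, hashing only the substring inside the first `{...}` when a non-empty hash tag is present. This is an illustrative reimplementation; real clients ship their own.

```python
# Toy reimplementation of Redis Cluster's key-to-slot mapping (illustration only).

def crc16(data: bytes) -> int:
    """CRC16-CCITT (XModem): poly 0x1021, init 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def hash_slot(key: str) -> int:
    """Map a key to one of the 16384 slots, honoring {hash tags}."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:  # only a non-empty tag is used
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384

# Keys sharing a hash tag land in the same slot, so multi-key ops can target them.
print(hash_slot("{user1000}.following") == hash_slot("{user1000}.followers"))  # True
```

Because both keys hash only the tag `user1000`, they occupy the same slot and can be used together in a single MGET or transaction.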

Key Features

  1. Automatic Data Distribution:

    • Redis Cluster distributes data across multiple nodes automatically.
    • Service can persist even when some nodes go down.
  2. Two Ports Required:

    • Client Port (e.g., 6379).
    • Cluster Bus Port (Client port + 10000), used for:
      • Node communication (binary protocol).
      • Failure detection, configuration updates, and failover authorization.
  3. Data Sharding:

    • Uses hash slots instead of consistent hashing.
    • Total of 16384 hash slots.
    • Hash calculated as CRC16(key) % 16384.
    • Nodes manage subsets of these slots.
    • Allows rebalancing (e.g., node addition/removal) without system downtime.
    • Hash Tags can force multiple keys into the same hash slot for multi-key operations.
  4. Failover:

    • Each master should have at least one slave (replica).
    • If a master fails, one of its slaves is promoted to master.
    • If a master and all of its slaves fail at the same time, the cluster can no longer serve that master's hash slots.
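
The two-port requirement and failover timing above are driven by a handful of redis.conf directives. A minimal sketch of a cluster node's configuration (the values shown are illustrative defaults, not recommendations):

```conf
# Minimal cluster-node configuration sketch (illustrative values).
port 6379                      # client port; the cluster bus uses 6379 + 10000 = 16379
cluster-enabled yes            # start this instance in cluster mode
cluster-config-file nodes.conf # auto-managed file where the node persists cluster state
cluster-node-timeout 5000      # milliseconds before an unreachable node is considered failing
```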

Consistency

  1. Eventual Consistency:

    • Replication is asynchronous: a master acknowledges a write before its slaves have received it.
    • Data can be lost if a master fails before propagation to its slaves.
    • Strong consistency is not guaranteed.
  2. WAIT Command:

    • Blocks the client until a given number of slaves have acknowledged the preceding writes, approximating synchronous replication.
    • Even so, strong consistency is not guaranteed: a slave that never received the write can still be elected master during failover.
  3. Partition Example:

    • Suppose a cluster with 3 masters (A, B, C) and their respective slaves (A1, B1, C1). If a network partition isolates:
      • Master B and a client Z1 on one side.
      • Masters A, C, and slaves A1, B1, C1 on the other.
    • Z1 writes to B, but if the partition persists long enough for B1 to be promoted on the majority side, the writes to B will be lost once the partition resolves.
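
The partition scenario above can be modeled with a toy simulation (the classes and names here are invented for illustration): master B acknowledges Z1's write before shipping it to B1, so when B1 is promoted on the majority side, the acknowledged write vanishes.

```python
# Toy model of write loss under asynchronous replication (illustration only).

class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}

master_b, replica_b1 = Node("B"), Node("B1")
replication_queue = []  # writes acked to the client but not yet shipped to B1

def write(key, value):
    master_b.data[key] = value
    replication_queue.append((key, value))  # replicated asynchronously, later
    return "OK"                             # client Z1 gets an ack immediately

def replicate():
    while replication_queue:
        key, value = replication_queue.pop(0)
        replica_b1.data[key] = value

write("k1", "v1")
replicate()          # k1 safely reaches B1 before the partition
write("k2", "v2")    # acked to Z1, but the partition hits before replication runs
# The majority side promotes B1; the un-replicated write is gone.
new_master = replica_b1
print("k2" in new_master.data)  # False: an acknowledged write was lost
```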

Trade-offs

  • Redis Cluster is neither strictly CP nor strictly AP in CAP terms:
    • Not fully consistent (CP): acknowledged writes can be lost during partitions.
    • Not fully available (AP): the minority side of a partition stops serving requests.

Handling Failures

  1. Node Timeout:

    • A master is considered failing if unresponsive for a duration specified by the node timeout configuration.
    • One of its replicas is then promoted to master.
  2. Merge Strategy:

    • Last Failover Wins:
      • The slot configuration with the highest (i.e., most recent) configuration epoch becomes authoritative.
      • Writes accepted by the old master during a network partition are discarded.
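
The "last failover wins" rule can be sketched in miniature: Redis Cluster tags each failover with a configuration epoch, and when two nodes claim the same hash slot, the claim with the higher epoch wins. This is a conceptual sketch, not the real cluster-bus protocol:

```python
# Toy "last failover wins" resolution using configuration epochs (illustration only).

def resolve_slot_owner(claims):
    """Given competing (node, config_epoch) claims for a slot,
    the highest epoch, i.e. the most recent failover, wins."""
    return max(claims, key=lambda claim: claim[1])[0]

# Old master B still claims the slot with epoch 3; B1's failover bumped it to 4.
claims = [("B", 3), ("B1", 4)]
print(resolve_slot_owner(claims))  # B1 wins; B's partition-era writes are discarded
```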

Design Considerations

  • No Value Merge Overhead:
    • Avoids storing the metadata needed to merge conflicting values, saving memory.
  • Unlimited Value Size:
    • Large values are supported but at the cost of reduced stability during network partitions.
  • Real-World Pragmatism:
    • Redis Cluster prioritizes practical business needs over strict adherence to theoretical distributed system limitations.

This approach strikes a balance between high performance and reasonable availability, making Redis Cluster suitable for many real-world applications.