Tuesday, June 9, 2015

Redis Cluster Summary

Goals

  1. High Performance:

    • Performance scales linearly up to around 1,000 nodes.
    • Asynchronous replication and no proxy layer: clients talk directly to the node that holds a given key.
    • No value-based merge operations.
    • Writes are acknowledged on a best-effort basis: they usually survive, but durability is not guaranteed.
  2. High Availability:

    • Service continues as long as a majority of masters are reachable.
    • Every unreachable master must have at least one reachable slave that can be promoted.
  3. Feature Subset:

    • Supports all single-key operations from standard Redis.
    • Multi-key operations are supported only when keys reside on the same node (e.g., using hashtags to group keys).
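
The hash-tag grouping mentioned above can be made concrete with a toy sketch of how Redis Cluster maps keys to slots: CRC16-CCITT (XModem variant) modulo 16384, hashing only the substring inside the first `{...}` when a non-empty hash tag is present. This is an illustrative reimplementation; real clients ship their own.

```python
# Toy reimplementation of Redis Cluster's key-to-slot mapping (illustration only).

def crc16(data: bytes) -> int:
    """CRC16-CCITT (XModem): poly 0x1021, init 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def hash_slot(key: str) -> int:
    """Map a key to one of the 16384 slots, honoring {hash tags}."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:  # only a non-empty tag is used
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384

# Keys sharing a hash tag land in the same slot, so multi-key ops can target them.
print(hash_slot("{user1000}.following") == hash_slot("{user1000}.followers"))  # True
```

Because both keys hash only the tag `user1000`, they occupy the same slot and can be used together in a single MGET or transaction.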

Key Features

  1. Automatic Data Distribution:

    • Redis Cluster distributes data across multiple nodes automatically.
    • Service can persist even when some nodes go down.
  2. Two Ports Required:

    • Client Port (e.g., 6379).
    • Cluster Bus Port (Client port + 10000), used for:
      • Node communication (binary protocol).
      • Failure detection, configuration updates, and failover authorization.
  3. Data Sharding:

    • Uses hash slots instead of consistent hashing.
    • Total of 16384 hash slots.
    • Hash calculated as CRC16(key) % 16384.
    • Nodes manage subsets of these slots.
    • Allows rebalancing (e.g., node addition/removal) without system downtime.
    • Hash Tags can force multiple keys into the same hash slot for multi-key operations.
  4. Failover:

    • Each master should have at least one slave (replica).
    • If a master fails, one of its slaves is promoted to master.
    • If a master and all of its slaves fail at the same time, the cluster can no longer serve that master's hash slots.
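
The two-port requirement and failover timing above are driven by a handful of redis.conf directives. A minimal sketch of a cluster node's configuration (the values shown are illustrative defaults, not recommendations):

```conf
# Minimal cluster-node configuration sketch (illustrative values).
port 6379                      # client port; the cluster bus uses 6379 + 10000 = 16379
cluster-enabled yes            # start this instance in cluster mode
cluster-config-file nodes.conf # auto-managed file where the node persists cluster state
cluster-node-timeout 5000      # milliseconds before an unreachable node is considered failing
```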

Consistency

  1. Eventual Consistency:

    • Replication is asynchronous: a master acknowledges a write before its slaves have received it.
    • Data can be lost if a master fails before propagation to its slaves.
    • Strong consistency is not guaranteed.
  2. WAIT Command:

    • Blocks the client until a given number of slaves have acknowledged the preceding writes, approximating synchronous replication.
    • Even so, strong consistency is not guaranteed: a slave that never received the write can still be elected master during failover.
  3. Partition Example:

    • Suppose a cluster with 3 masters (A, B, C) and their respective slaves (A1, B1, C1). If a network partition isolates:
      • Master B and a client Z1 on one side.
      • Masters A, C, and slaves A1, B1, C1 on the other.
    • Z1 writes to B, but if the partition persists long enough for B1 to be promoted on the majority side, the writes to B will be lost once the partition resolves.
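
The partition scenario above can be modeled with a toy simulation (the classes and names here are invented for illustration): master B acknowledges Z1's write before shipping it to B1, so when B1 is promoted on the majority side, the acknowledged write vanishes.

```python
# Toy model of write loss under asynchronous replication (illustration only).

class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}

master_b, replica_b1 = Node("B"), Node("B1")
replication_queue = []  # writes acked to the client but not yet shipped to B1

def write(key, value):
    master_b.data[key] = value
    replication_queue.append((key, value))  # replicated asynchronously, later
    return "OK"                             # client Z1 gets an ack immediately

def replicate():
    while replication_queue:
        key, value = replication_queue.pop(0)
        replica_b1.data[key] = value

write("k1", "v1")
replicate()          # k1 safely reaches B1 before the partition
write("k2", "v2")    # acked to Z1, but the partition hits before replication runs
# The majority side promotes B1; the un-replicated write is gone.
new_master = replica_b1
print("k2" in new_master.data)  # False: an acknowledged write was lost
```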

Trade-offs

  • Redis Cluster is neither strictly CP nor strictly AP in CAP terms:
    • Not fully consistent (CP): acknowledged writes can be lost during partitions.
    • Not fully available (AP): the minority side of a partition stops serving requests.

Handling Failures

  1. Node Timeout:

    • A master is considered failing if unresponsive for a duration specified by the node timeout configuration.
    • One of its replicas is then promoted to master.
  2. Merge Strategy:

    • Last Failover Wins:
      • The slot configuration with the highest (i.e., most recent) configuration epoch becomes authoritative.
      • Writes accepted by the old master during a network partition are discarded.
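
The "last failover wins" rule can be sketched in miniature: Redis Cluster tags each failover with a configuration epoch, and when two nodes claim the same hash slot, the claim with the higher epoch wins. This is a conceptual sketch, not the real cluster-bus protocol:

```python
# Toy "last failover wins" resolution using configuration epochs (illustration only).

def resolve_slot_owner(claims):
    """Given competing (node, config_epoch) claims for a slot,
    the highest epoch, i.e. the most recent failover, wins."""
    return max(claims, key=lambda claim: claim[1])[0]

# Old master B still claims the slot with epoch 3; B1's failover bumped it to 4.
claims = [("B", 3), ("B1", 4)]
print(resolve_slot_owner(claims))  # B1 wins; B's partition-era writes are discarded
```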

Design Considerations

  • No Value Merge Overhead:
    • Avoids storing the metadata needed to merge conflicting values, saving memory.
  • Unlimited Value Size:
    • Large values are supported but at the cost of reduced stability during network partitions.
  • Real-World Pragmatism:
    • Redis Cluster prioritizes practical business needs over strict adherence to theoretical distributed system limitations.

This approach strikes a balance between high performance and reasonable availability, making Redis Cluster suitable for many real-world applications.