Redis High Availability
A single Redis instance is a single point of failure. At production scale — millions of ops/sec, sub-millisecond SLOs, zero-downtime deploys — you need a topology that can survive node failures, network partitions, and rolling upgrades without manual intervention. Redis ships two complementary HA mechanisms: Sentinel for leader-election on a replicated primary/replica pair, and Redis Cluster for sharded, self-healing distributed state.
Redis Sentinel
Sentinel is a separate process (or set of processes) that monitors a primary and its replicas, decides when the primary is down, promotes a replica to primary, reconfigures the remaining replicas to follow it, and tells clients where the current primary is. It is the right choice when your dataset fits on a single node (typically under ~100 GB working set) and you want automatic failover without data sharding.
Topology: run an odd number of Sentinel instances (minimum 3, usually 3 or 5) spread across availability zones. Two thresholds apply. The primary is marked objectively down only when at least quorum Sentinels (the value set in sentinel monitor) agree it is unreachable, and the failover itself starts only after one Sentinel has been elected leader by a majority of all known Sentinels (floor(sentinels/2)+1). Running two Sentinels gives you no safety: the majority of two is two, so losing either Sentinel, or the link between them, blocks failover entirely.
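The majority rule is easy to tabulate. The following is a small sketch of the arithmetic only; the function names are made up for illustration and are not part of any Redis tooling.

```python
def majority(sentinels: int) -> int:
    """Votes needed to authorize a failover: floor(n / 2) + 1."""
    if sentinels < 1:
        raise ValueError("need at least one Sentinel")
    return sentinels // 2 + 1


def tolerated_failures(sentinels: int) -> int:
    """Sentinels that can be lost while a majority is still reachable."""
    return sentinels - majority(sentinels)


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(f"{n} sentinels: majority={majority(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Two Sentinels tolerate zero failures, and four tolerate no more than three do, which is why odd counts are recommended.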
A minimal Sentinel configuration file (/etc/redis/sentinel.conf) looks like this:
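A sketch of the directives such a file typically carries, rendered from Python so every value is explicit. The master name mymaster, the address 10.0.0.1:6379, the quorum of 2, and the timeout values are illustrative assumptions, not values taken from a real deployment.

```python
# Template containing only standard Sentinel directives.
SENTINEL_CONF_TEMPLATE = """\
port 26379
sentinel monitor {name} {host} {port} {quorum}
sentinel down-after-milliseconds {name} {down_after_ms}
sentinel failover-timeout {name} {failover_timeout_ms}
sentinel parallel-syncs {name} {parallel_syncs}
"""


def render_sentinel_conf(name="mymaster", host="10.0.0.1", port=6379,
                         quorum=2, down_after_ms=5000,
                         failover_timeout_ms=60000, parallel_syncs=1):
    """Return the text of a minimal sentinel.conf (all defaults are assumptions)."""
    return SENTINEL_CONF_TEMPLATE.format(
        name=name, host=host, port=port, quorum=quorum,
        down_after_ms=down_after_ms,
        failover_timeout_ms=failover_timeout_ms,
        parallel_syncs=parallel_syncs,
    )


if __name__ == "__main__":
    print(render_sentinel_conf())
```

The same text would be written to each Sentinel host. The file must stay writable by the Sentinel process, because Sentinel rewrites it as it discovers replicas and other Sentinels.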
After failover, Sentinel rewrites its own config file with the new primary address and broadcasts +switch-master on its Pub/Sub channel. Clients using the Sentinel-aware connection pattern (e.g., redis-py's Sentinel class or Jedis JedisSentinelPool) automatically re-resolve. Clients that hard-code the primary IP will break — this is the most common production outage pattern with Sentinel.
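For clients or tooling that subscribe to Sentinel events directly, the +switch-master payload has the form "<master name> <old ip> <old port> <new ip> <new port>". A minimal parsing sketch follows; the SwitchMaster type and function name are hypothetical, and the subscription itself (for example via a Pub/Sub client connected to a Sentinel) is left out.

```python
from typing import NamedTuple, Union


class SwitchMaster(NamedTuple):
    name: str
    old_host: str
    old_port: int
    new_host: str
    new_port: int


def parse_switch_master(payload: Union[str, bytes]) -> SwitchMaster:
    """Parse the data field of a +switch-master Pub/Sub message."""
    if isinstance(payload, bytes):
        payload = payload.decode("utf-8")
    parts = payload.split()
    if len(parts) != 5:
        raise ValueError(f"unexpected +switch-master payload: {payload!r}")
    name, old_host, old_port, new_host, new_port = parts
    return SwitchMaster(name, old_host, int(old_port), new_host, int(new_port))


if __name__ == "__main__":
    event = parse_switch_master(b"mymaster 10.0.0.1 6379 10.0.0.2 6379")
    print(f"{event.name}: reconnect to {event.new_host}:{event.new_port}")
```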
If min-replicas-to-write is not set on the primary, an old primary that has been partitioned away can continue accepting writes during the election window, causing divergent data that is discarded when it rejoins as a replica. Always configure min-replicas-to-write 1 and min-replicas-max-lag 10 on the primary so it refuses writes once no replica has acknowledged within 10 seconds.
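The rule those two settings express can be modeled in a few lines. This is a simplified sketch of the decision, not Redis source code: the function name is hypothetical, and the lag values stand for the seconds since each replica's last acknowledgement.

```python
from typing import Iterable


def primary_accepts_writes(replica_lags_s: Iterable[float],
                           min_replicas_to_write: int = 1,
                           min_replicas_max_lag: int = 10) -> bool:
    """True if enough replicas acknowledged recently enough.

    A min_replicas_to_write of 0 disables the check, as in Redis.
    """
    healthy = sum(1 for lag in replica_lags_s if lag <= min_replicas_max_lag)
    return healthy >= min_replicas_to_write


if __name__ == "__main__":
    print(primary_accepts_writes([1, 3]))   # both replicas are current
    print(primary_accepts_writes([25]))     # only replica is too far behind
    print(primary_accepts_writes([]))       # isolated primary, no replicas
```

An isolated primary sees its replicas' lag grow past the limit and starts rejecting writes, which bounds the divergence window to roughly min-replicas-max-lag seconds.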
Redis Cluster
Redis Cluster shards data across up to 1,000 nodes using 16,384 hash slots. Each key maps to a slot via CRC16(key) % 16384; when the key contains a non-empty {tag}, only the tag is hashed. Each of the N primary shards owns a subset of the slots (contiguous ranges after a default bootstrap, though any assignment is valid), and each primary has one or more replicas. The cluster is self-healing: it detects node failures through gossip, promotes replicas automatically, and re-advertises slot ownership without external coordination.
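The slot mapping can be reproduced client-side. The sketch below assumes the CRC16 variant is XMODEM (polynomial 0x1021, initial value 0), which is what the Redis Cluster specification describes, and applies the hash-tag rule; the function names are illustrative.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16/XMODEM: polynomial 0x1021, initial value 0x0000."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: bytes) -> int:
    """Hash slot for a key, hashing only the first non-empty {tag} if present."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16_xmodem(key) % 16384


if __name__ == "__main__":
    for k in (b"user:1000", b"{user1000}.following", b"{user1000}.followers"):
        print(k.decode(), key_slot(k))
```

The two {user1000} keys land in the same slot, which is what makes multi-key operations on them legal in Cluster.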
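The minimum viable production cluster is 6 nodes: 3 primaries + 3 replicas, with each primary in a different AZ from its replica. A single-AZ failure then never takes out both copies of a shard, so every primary lost with the AZ has a surviving replica to promote.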
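A placement plan can be checked before it is deployed. The sketch below uses a hypothetical layout (shard names and AZ labels are made up) where each shard lists the AZ of its primary first, followed by the AZs of its replicas.

```python
from typing import Dict, List

Placement = Dict[str, List[str]]  # shard -> [primary AZ, replica AZ, ...]


def primaries_lost(placement: Placement, failed_az: str) -> List[str]:
    """Shards whose primary sits in the failed AZ and needs a failover."""
    return [shard for shard, azs in placement.items() if azs[0] == failed_az]


def shards_without_survivor(placement: Placement, failed_az: str) -> List[str]:
    """Shards with every copy in the failed AZ; these slots go offline."""
    return [shard for shard, azs in placement.items()
            if all(az == failed_az for az in azs)]


if __name__ == "__main__":
    plan = {
        "shard-a": ["az1", "az2"],
        "shard-b": ["az2", "az3"],
        "shard-c": ["az3", "az1"],
    }
    for az in ("az1", "az2", "az3"):
        print(az, primaries_lost(plan, az), shards_without_survivor(plan, az))
```

With this layout each AZ failure costs one primary but leaves no shard without a surviving copy.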
Bootstrap a 6-node cluster (nodes pre-running with cluster-enabled yes):
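One way to build the bootstrap invocation is to assemble the redis-cli --cluster create command from a node list. In this sketch the addresses are placeholders and the helper name is hypothetical; only the redis-cli flags themselves are real.

```python
import shlex
from typing import List


def cluster_create_command(nodes: List[str],
                           replicas_per_primary: int = 1) -> List[str]:
    """Build the argv for `redis-cli --cluster create` (does not run it)."""
    minimum = 3 * (replicas_per_primary + 1)  # at least 3 primaries
    if len(nodes) < minimum:
        raise ValueError(f"need at least {minimum} nodes, got {len(nodes)}")
    return ["redis-cli", "--cluster", "create", *nodes,
            "--cluster-replicas", str(replicas_per_primary)]


if __name__ == "__main__":
    # Placeholder addresses: two nodes in each of three AZ subnets.
    nodes = [f"10.0.{az}.{host}:6379" for az in (1, 2, 3) for host in (10, 11)]
    print(shlex.join(cluster_create_command(nodes)))
    # To execute for real: subprocess.run(cluster_create_command(nodes), check=True)
```

redis-cli knows nothing about availability zones, so it is worth confirming the resulting primary/replica pairing with CLUSTER NODES before putting the cluster into service.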
Failover Behavior in Depth
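Both topologies follow the same broad failover timeline: subjective down → objective down → election → promotion, followed in Cluster by slot re-advertisement. Sentinel calls the first two states SDOWN and ODOWN; Cluster calls them PFAIL and FAIL. Key operational parameters that determine RTO: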
- cluster-node-timeout (Cluster) / down-after-milliseconds (Sentinel): how long before a node is considered down. Lower = faster failover; too low = false positives during GC pauses or network jitter. 5–15 s is the industry norm.
- Replica replication lag at the moment of primary failure determines how much data the promoted replica is missing. With default async replication, any writes since the last acknowledged offset are lost. Quantify this with INFO replication → master_repl_offset vs slave_repl_offset.
- In Redis Cluster, clients receive a MOVED redirect (permanent slot move) or an ASK redirect (slot mid-migration; the client sends ASKING to the target node before retrying). Cluster-aware clients (Lettuce, redis-py with RedisCluster) handle these transparently; generic clients do not.
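The replication-lag comparison can be scripted against the text returned by INFO replication on the primary, where each replica appears as a slaveN line carrying its acknowledged offset. A parsing sketch follows; the sample output is fabricated for illustration and the function name is hypothetical.

```python
from typing import Dict


def replica_backlog_bytes(info_text: str) -> Dict[str, int]:
    """Bytes each replica is behind the primary, keyed by 'ip:port'."""
    master_offset = None
    replica_offsets = {}
    for raw in info_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        if key == "master_repl_offset":
            master_offset = int(value)
        elif key.startswith("slave") and key[5:].isdigit():
            # e.g. slave0:ip=10.0.0.2,port=6379,state=online,offset=1000,lag=0
            fields = dict(item.split("=", 1) for item in value.split(","))
            addr = f"{fields['ip']}:{fields['port']}"
            replica_offsets[addr] = int(fields["offset"])
    if master_offset is None:
        raise ValueError("master_repl_offset not found")
    return {addr: master_offset - off for addr, off in replica_offsets.items()}


if __name__ == "__main__":
    sample = """\
# Replication
role:master
connected_slaves:2
slave0:ip=10.0.0.2,port=6379,state=online,offset=1000,lag=0
slave1:ip=10.0.0.3,port=6379,state=online,offset=940,lag=1
master_repl_offset:1000
"""
    print(replica_backlog_bytes(sample))
```

The byte gap is an upper bound on what a promotion of that replica would lose at that instant; sampling it under peak load gives a realistic RPO estimate.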
To test failover, run redis-cli DEBUG SLEEP 30 on the primary while watching redis-cli -h <sentinel> -p 26379 SENTINEL masters in another terminal; the 30-second sleep must exceed down-after-milliseconds for a failover to trigger. Measure actual failover duration and confirm clients reconnected. Schedule this quarterly, because topology knowledge decays.
Sentinel vs Cluster: Choosing the Right Model
| Dimension | Sentinel | Cluster |
|---|---|---|
| Data size | Single node fits in RAM | Horizontal sharding |
| Multi-key ops | All operations supported | Cross-slot ops need hash tags {} |
| Client requirement | Sentinel-aware client | Cluster-aware client |
| Ops complexity | Lower | Higher (resharding, slot migration) |
| Write throughput | Single node ceiling (~500k ops/s) | Scales linearly with shards |
At Google and Meta scale, teams typically run Cluster at ≥ 9 shards (3 per AZ) with cluster-replicas 2 (two replicas per shard, placed in the other AZs), so that after an AZ failure every shard still has a promotable copy plus one spare replica. Shard count is determined by peak write throughput and working-set size, not total data volume: keys evicted by maxmemory-policy allkeys-lru are gone, so size the cluster so the hot working set fits in RAM with a 30% headroom buffer.
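The sizing rule can be turned into a back-of-the-envelope calculation. This sketch makes several assumptions: headroom means the working set plus 30% must fit in aggregate primary RAM, the per-node write ceiling defaults to the ~500k ops/s figure from the table above, inputs are whole numbers, and the result is floored at the 3-primary cluster minimum.

```python
def shards_needed(working_set_gb: int, node_ram_gb: int,
                  peak_writes_per_s: int,
                  node_write_ceiling_per_s: int = 500_000,
                  headroom_pct: int = 30) -> int:
    """Primary shard count from memory and write-throughput constraints."""
    def ceil_div(a: int, b: int) -> int:
        return -(-a // b)

    # Memory: working set plus headroom must fit across all primaries.
    by_memory = ceil_div(working_set_gb * (100 + headroom_pct),
                         node_ram_gb * 100)
    # Throughput: peak writes divided by what one primary can absorb.
    by_writes = ceil_div(peak_writes_per_s, node_write_ceiling_per_s)
    return max(by_memory, by_writes, 3)


if __name__ == "__main__":
    print(shards_needed(working_set_gb=400, node_ram_gb=64,
                        peak_writes_per_s=1_200_000))
```

Whichever constraint is tighter sets the shard count; replicas are then added per shard for availability and do not increase write capacity.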