Caching & Messaging Infrastructure

Redis High Availability

A single Redis instance is a single point of failure. At production scale — millions of ops/sec, sub-millisecond SLOs, zero-downtime deploys — you need a topology that can survive node failures, network partitions, and rolling upgrades without manual intervention. Redis ships two complementary HA mechanisms: Sentinel for leader-election on a replicated primary/replica pair, and Redis Cluster for sharded, self-healing distributed state.

Redis Sentinel

Sentinel is a separate process (or set of processes) that monitors a primary and its replicas, decides when the primary is down, promotes a replica to take its place, and acts as the configuration source that clients query for the current primary address. It is the right choice when your dataset fits on a single node (typically under ~100 GB working set) and you want automatic failover without data sharding.

Topology: run an odd number of Sentinel instances (minimum 3, usually 3 or 5) spread across availability zones. Two thresholds apply. The configured quorum is the number of Sentinels that must agree the primary is unreachable before it is marked objectively down. The failover itself proceeds only after a majority of all known Sentinels (floor(sentinels/2)+1) authorizes one Sentinel to lead it. Running two Sentinels gives you no safety: the majority of two is two, so losing either Sentinel makes failover impossible.
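
The two thresholds can be checked with a few lines of arithmetic. This is a minimal Python sketch, not Sentinel's actual implementation; the function names are illustrative, and `agreeing_sentinels` is assumed to mean the Sentinels that are up, can reach each other, and see the primary as down.

```python
def majority(total_sentinels: int) -> int:
    """Votes needed to authorize a failover: floor(n/2) + 1."""
    return total_sentinels // 2 + 1


def can_failover(total_sentinels: int, agreeing_sentinels: int, quorum: int) -> bool:
    """True if the primary can be marked objectively down AND a leader can be authorized."""
    objectively_down = agreeing_sentinels >= quorum
    leader_authorized = agreeing_sentinels >= majority(total_sentinels)
    return objectively_down and leader_authorized


if __name__ == "__main__":
    # Three Sentinels, quorum 2, one Sentinel lost along with the primary's AZ.
    print(can_failover(total_sentinels=3, agreeing_sentinels=2, quorum=2))
    # Two Sentinels, quorum 1, one Sentinel lost: no majority is possible.
    print(can_failover(total_sentinels=2, agreeing_sentinels=1, quorum=1))
```

The second case shows why lowering the quorum does not rescue a two-Sentinel deployment: the majority requirement is independent of the configured quorum.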

Figure: Redis Sentinel topology across three availability zones (AZ-1, AZ-2, AZ-3). Three Sentinels (port 26379) monitor a primary and two replicas (port 6379, asynchronous replication); the application resolves the current primary via any Sentinel (SENTINEL get-master-addr-by-name) before connecting.

A minimal Sentinel configuration file (/etc/redis/sentinel.conf) looks like this:

    # /etc/redis/sentinel.conf (same file deployed to all three Sentinel nodes)
    # Redis config files do not allow trailing comments, so each comment sits on its own line.
    bind 0.0.0.0
    port 26379

    # Monitor the primary at 10.0.1.10:6379; quorum = 2
    sentinel monitor mymaster 10.0.1.10 6379 2
    # 5 s without a valid reply before this Sentinel flags subjective down
    sentinel down-after-milliseconds mymaster 5000
    # 60 s total failover budget
    sentinel failover-timeout mymaster 60000
    # Only 1 replica resyncs with the new primary at a time during failover
    sentinel parallel-syncs mymaster 1
    # Password Sentinel uses to authenticate to the monitored Redis nodes
    sentinel auth-pass mymaster s3cr3tP@ss
    # Protect the Sentinel API itself
    requirepass s3cr3tR3nt1n3lP@ss

    # Notification hooks
    sentinel notification-script mymaster /opt/redis/notify.sh
    sentinel client-reconfig-script mymaster /opt/redis/reconfig.sh

After failover, Sentinel rewrites its own config file with the new primary address and broadcasts +switch-master on its Pub/Sub channel. Clients using the Sentinel-aware connection pattern (e.g., redis-py's Sentinel class or Jedis JedisSentinelPool) automatically re-resolve. Clients that hard-code the primary IP will break — this is the most common production outage pattern with Sentinel.
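
To make the re-resolution step concrete, here is a small Python sketch of what a Sentinel-aware client does with a +switch-master message. It assumes the documented payload layout `<master-name> <old-ip> <old-port> <new-ip> <new-port>`; the `SwitchMaster` type, the function name, and the sample addresses are illustrative and not part of any client library.

```python
from typing import NamedTuple, Tuple


class SwitchMaster(NamedTuple):
    name: str
    old_addr: Tuple[str, int]
    new_addr: Tuple[str, int]


def parse_switch_master(payload: str) -> SwitchMaster:
    """Parse the payload of a +switch-master Pub/Sub message."""
    parts = payload.split()
    if len(parts) != 5:
        raise ValueError(f"unexpected +switch-master payload: {payload!r}")
    name, old_ip, old_port, new_ip, new_port = parts
    return SwitchMaster(name, (old_ip, int(old_port)), (new_ip, int(new_port)))


if __name__ == "__main__":
    # Sample payload with made-up addresses.
    event = parse_switch_master("mymaster 10.0.1.10 6379 10.0.2.10 6379")
    # A real client would now drop pooled connections to old_addr
    # and reconnect to new_addr.
    print(event.name, event.new_addr)
```

In practice you rarely write this yourself: Sentinel-aware clients subscribe or re-query Sentinel internally. The sketch only shows why a hard-coded primary address cannot follow the switch.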

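Split-brain in Sentinel: if min-replicas-to-write is not set on the primary, a primary that is partitioned away from its replicas and Sentinels keeps accepting writes from clients on its side of the partition, and those writes are discarded when it rejoins as a replica of the new primary. Configure min-replicas-to-write 1 and min-replicas-max-lag 10 on every data node (any replica may become the primary) so an isolated primary stops accepting writes once no replica has acknowledged within 10 seconds. This bounds the divergence window rather than eliminating it.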
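
The write-refusal rule behind these two settings can be sketched in Python as follows. This is a simplified model of the check, assuming each replica's lag is the number of seconds since its last acknowledgement; the function and parameter names are illustrative.

```python
from typing import Iterable


def primary_accepts_writes(
    replica_lags_s: Iterable[float],
    min_replicas_to_write: int = 1,
    min_replicas_max_lag: int = 10,
) -> bool:
    """Model of the min-replicas-* check: count replicas whose last ACK is recent enough."""
    if min_replicas_to_write == 0:
        return True  # feature disabled: writes are always accepted
    good = sum(1 for lag in replica_lags_s if lag <= min_replicas_max_lag)
    return good >= min_replicas_to_write


if __name__ == "__main__":
    print(primary_accepts_writes([0, 1]))    # healthy: both replicas are current
    print(primary_accepts_writes([11, 30]))  # isolated: no replica ACK within 10 s
```

Note the consequence for an isolated primary: it keeps accepting writes for up to min-replicas-max-lag seconds after the partition starts, which is the bounded window mentioned above.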

Redis Cluster

Redis Cluster shards data across up to 1,000 nodes using 16,384 hash slots. Each key maps to a slot via CRC16(key) % 16384; if the key contains a non-empty {hash tag}, only the text inside the braces is hashed. Each of the N primary shards owns a subset of the slots (contiguous ranges by default at creation), and each primary has one or more replicas. The cluster is self-healing: it detects node failures through gossip, promotes replicas automatically, and re-advertises slot ownership without external coordination.
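
The slot mapping is easy to reproduce offline. Below is a minimal Python sketch, assuming the CRC16 variant is XMODEM (polynomial 0x1021, initial value 0) as given in the Redis Cluster specification, and including the hash-tag rule that hashes only the text between the first `{` and the next `}` when that text is non-empty. The function names are illustrative.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16/XMODEM: polynomial 0x1021, initial value 0x0000."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def hashed_part(key: bytes) -> bytes:
    """Return the hash tag content if the key has a non-empty one, else the whole key."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    """Map a key to one of the 16,384 hash slots."""
    return crc16_xmodem(hashed_part(key.encode("utf-8"))) % 16384


if __name__ == "__main__":
    # Keys sharing a hash tag land in the same slot, so multi-key ops work on them.
    print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))
```

Against a live cluster, CLUSTER KEYSLOT <key> returns the server's own answer, which is a convenient way to cross-check a client-side implementation like this one.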

The minimum viable production cluster is 6 nodes: 3 primaries + 3 replicas, with each primary in a different AZ from its replica. A single-AZ failure then takes out at most one primary, whose replica in another AZ is promoted, so no slot range is lost.

Figure: Redis Cluster with 3 shards and cross-AZ replica placement. Shard A primary (slots 0–5460) is in AZ-1 with its replica in AZ-2; Shard B primary (slots 5461–10922) is in AZ-2 with its replica in AZ-3; Shard C primary (slots 10923–16383) is in AZ-3 with its replica in AZ-1. Nodes exchange state over gossip, and the cluster-aware client follows MOVED / ASK redirects. Because each primary is in a different AZ from its replica, a single AZ failure leaves a copy of every slot range available for promotion.

Bootstrap a 6-node cluster (nodes pre-running with cluster-enabled yes):

    # redis.conf snippet required on every node
    # (comments must be on their own lines; Redis does not parse trailing comments)
    cluster-enabled yes
    cluster-config-file /var/lib/redis/nodes.conf
    # ms without contact before a node is flagged as failing
    cluster-node-timeout 15000
    # keep serving the slots that are still covered when a shard is lost
    cluster-require-full-coverage no
    # a replica out of contact with its primary for longer than
    # (cluster-node-timeout * 10) + repl-ping-replica-period will not attempt failover
    cluster-replica-validity-factor 10
    appendonly yes
    appendfsync everysec

    # Create the cluster (redis-cli 5.0 or later)
    redis-cli --cluster create \
      10.0.1.10:6379 10.0.1.11:6379 10.0.1.12:6379 \
      10.0.2.10:6379 10.0.2.11:6379 10.0.2.12:6379 \
      --cluster-replicas 1 \
      -a <password>

    # Verify slot distribution and health
    redis-cli -c -h 10.0.1.10 -a <password> cluster info
    redis-cli --cluster check 10.0.1.10:6379 -a <password>

Failover Behavior in Depth

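Both topologies follow broadly the same failover timeline: subjective down → objective down → election → promotion → propagation of the new configuration. Cluster calls the first two states PFAIL and FAIL, and its final step is slot re-advertisement over gossip; Sentinel's final step is the +switch-master announcement. Key operational parameters that determine RTO: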

  • cluster-node-timeout (Cluster) / down-after-milliseconds (Sentinel): how long before a node is considered down. Lower = faster failover; too low = false positives during GC pauses or network jitter. 5–15 s is a common production range.
  • Replica replication lag at the moment of primary failure determines how much data the promoted replica is missing. With default asynchronous replication, any writes the promoted replica had not yet received are lost. Quantify this with INFO replication: compare master_repl_offset on the primary with slave_repl_offset on each replica.
  • In Redis Cluster, clients receive a MOVED redirect (slot ownership has changed permanently) or an ASK redirect (slot mid-migration; the client sends ASKING to the target node before retrying). Cluster-aware clients (Lettuce, redis-py with RedisCluster) handle these transparently; generic clients do not.

Production failover drill: run redis-cli DEBUG SLEEP 30 on the primary while watching redis-cli -h <sentinel> -p 26379 SENTINEL masters in another terminal. (On Redis 7 and later, DEBUG is disabled by default and must be allowed with enable-debug-command.) Measure actual failover duration and confirm clients reconnected. Schedule this quarterly, because topology knowledge decays.
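
To quantify the replication-lag exposure described in the list above, one option is to parse the INFO replication output from the primary, which lists each replica's acknowledged offset next to master_repl_offset. The Python sketch below works on the raw text so it needs no client library; the sample output is fabricated for illustration, and the function name is an assumption.

```python
import re
from typing import Dict

# Illustrative INFO replication output from a primary (values are made up).
SAMPLE_INFO = """\
# Replication
role:master
connected_slaves:2
slave0:ip=10.0.2.10,port=6379,state=online,offset=1000,lag=0
slave1:ip=10.0.3.10,port=6379,state=online,offset=940,lag=1
master_repl_offset:1000
"""


def replica_lag_bytes(info_text: str) -> Dict[str, int]:
    """Return bytes of replication stream each replica is behind the primary."""
    master_offset = None
    replica_offsets: Dict[str, int] = {}
    for raw in info_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        if key == "master_repl_offset":
            master_offset = int(value)
        elif re.fullmatch(r"slave\d+", key):
            fields = dict(item.split("=", 1) for item in value.split(","))
            addr = f"{fields['ip']}:{fields['port']}"
            replica_offsets[addr] = int(fields["offset"])
    if master_offset is None:
        raise ValueError("master_repl_offset not found; is this a primary's output?")
    return {addr: master_offset - off for addr, off in replica_offsets.items()}


if __name__ == "__main__":
    for addr, behind in replica_lag_bytes(SAMPLE_INFO).items():
        print(addr, behind)
```

The byte count is an upper bound on what a failover to that replica would lose at that instant; sampling it during peak write load gives a more honest RPO estimate than sampling at idle.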

Sentinel vs Cluster: Choosing the Right Model

Dimension            Sentinel                            Cluster
Data size            Single node fits in RAM             Horizontal sharding
Multi-key ops        All operations supported            Cross-slot ops need hash tags {}
Client requirement   Sentinel-aware client               Cluster-aware client
Ops complexity       Lower                               Higher (resharding, slot migration)
Write throughput     Single node ceiling (~500k ops/s)   Scales roughly linearly with shards

Managed Redis HA: ElastiCache (AWS), Memorystore (GCP), and Azure Cache for Redis all offer replicated failover and Cluster-mode configurations built on the same concepts as Sentinel and Cluster. You still need to understand the underlying mechanics to configure node sizes, replica counts, Multi-AZ, and maintenance windows correctly, and to interpret the metrics they surface in CloudWatch, Cloud Monitoring (formerly Stackdriver), or Azure Monitor.

At very large scale (the Google and Meta class of deployment), teams typically run Cluster at ≥ 9 shards (3 primaries per AZ) with cluster-replicas 2, that is, two replicas per shard placed in the other two AZs. An AZ failure then still takes down the primaries in that AZ, but every shard keeps two copies in the surviving AZs: one is promoted and one remains as a read replica. Shard count is determined by peak write throughput and working-set size, not total data volume: Redis keys evicted by maxmemory-policy allkeys-lru are gone, so size the cluster so the hot working set fits in RAM with a 30% headroom buffer.
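
The sizing rule can be turned into a rough calculation. The Python sketch below takes the larger of a throughput-driven and a memory-driven shard count. It assumes the ~500k ops/s single-node ceiling from the comparison table, a hypothetical 64 GB of RAM per node, and reads "30% headroom" as leaving 30% of each node's RAM unused; all of these are placeholders to replace with your own measurements.

```python
import math


def shards_needed(
    peak_writes_per_s: float,
    working_set_gb: float,
    per_shard_write_ceiling: float = 500_000,  # assumption: ceiling from the table above
    node_ram_gb: float = 64.0,                 # assumption: hypothetical node size
    headroom: float = 0.30,                    # fraction of RAM kept free
) -> int:
    """Estimate primary shard count from peak write rate and hot working set."""
    by_throughput = math.ceil(peak_writes_per_s / per_shard_write_ceiling)
    usable_gb_per_node = node_ram_gb * (1 - headroom)
    by_memory = math.ceil(working_set_gb / usable_gb_per_node)
    # Redis Cluster needs at least 3 primaries.
    return max(by_throughput, by_memory, 3)


if __name__ == "__main__":
    # 2M writes/s peak and a 300 GB hot working set on 64 GB nodes.
    print(shards_needed(2_000_000, 300))
```

In this example memory is the binding constraint, not throughput, which is the typical reason to rerun the estimate whenever the hot working set grows.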