Caching & Messaging Infrastructure

Redis High Availability

A single Redis instance is a single point of failure. At production scale — millions of ops/sec, sub-millisecond SLOs, zero-downtime deploys — you need a topology that can survive node failures, network partitions, and rolling upgrades without manual intervention. Redis ships two complementary HA mechanisms: Sentinel for leader-election on a replicated primary/replica pair, and Redis Cluster for sharded, self-healing distributed state.

Redis Sentinel

Sentinel is a separate process (or set of processes) that monitors a primary and its replicas, decides when the primary is down, promotes a replica to primary, reconfigures the remaining replicas, and tells clients the new primary's address. It is the right choice when your dataset fits on a single node (typically under ~100 GB working set) and you want automatic failover without data sharding.

Topology: run an odd number of Sentinel instances (minimum 3, usually 3 or 5) spread across availability zones. Two thresholds apply. The configured quorum is the number of Sentinels that must agree the primary is unreachable before it is marked objectively down. The failover itself then proceeds only once one Sentinel has been authorized as leader by a majority of all Sentinels (floor(sentinels/2)+1). Running two Sentinels gives you no safety: the majority of two is two, so losing either Sentinel makes failover impossible.

Redis Sentinel topology across three availability zones: three Sentinels monitor Primary + 2 Replicas; the application resolves the current primary via any Sentinel before connecting.
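
The quorum and majority rules can be checked with simple arithmetic. The following is a minimal sketch, not Sentinel's actual code; the function names `majority` and `can_fail_over` are made up for illustration.

```python
def majority(sentinels: int) -> int:
    """Votes needed to authorize a failover leader: floor(n/2) + 1."""
    return sentinels // 2 + 1


def can_fail_over(total: int, reachable: int, quorum: int) -> bool:
    """True if the reachable Sentinels can both mark the primary as
    objectively down (quorum) and authorize a leader (majority of all)."""
    return reachable >= quorum and reachable >= majority(total)


if __name__ == "__main__":
    # (total Sentinels, Sentinels still reachable, configured quorum)
    for total, reachable, quorum in [(3, 2, 2), (2, 1, 1), (5, 3, 2), (5, 2, 2)]:
        ok = can_fail_over(total, reachable, quorum)
        print(f"total={total} reachable={reachable} quorum={quorum} -> failover possible: {ok}")
```

Note the (5, 2, 2) case: the quorum of 2 is met, so the primary can be flagged as objectively down, but two Sentinels are not a majority of five, so no failover is authorized.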

A minimal Sentinel configuration file (/etc/redis/sentinel.conf) looks like this:

# /etc/redis/sentinel.conf (same file deployed to all three Sentinel nodes)
# Comments must sit on their own lines; Redis config files do not support trailing comments.
bind 0.0.0.0
port 26379

# quorum = 2
sentinel monitor mymaster 10.0.1.10 6379 2
# 5 s without a valid reply before the primary is subjectively down
sentinel down-after-milliseconds mymaster 5000
# 60 s failover timeout
sentinel failover-timeout mymaster 60000
# only 1 replica resyncs with the new primary at a time
sentinel parallel-syncs mymaster 1

# password Sentinel uses to authenticate to the monitored primary and replicas
sentinel auth-pass mymaster s3cr3tP@ss
# protect the Sentinel API itself
requirepass s3cr3tR3nt1n3lP@ss

# Notification hooks
sentinel notification-script mymaster /opt/redis/notify.sh
sentinel client-reconfig-script mymaster /opt/redis/reconfig.sh

After failover, Sentinel rewrites its own config file with the new primary address and broadcasts +switch-master on its Pub/Sub channel. Clients using the Sentinel-aware connection pattern (e.g., redis-py's Sentinel class or Jedis JedisSentinelPool) automatically re-resolve. Clients that hard-code the primary IP will break — this is the most common production outage pattern with Sentinel.
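
As a rough illustration of the Sentinel-aware pattern, the sketch below parses a +switch-master payload (`<master-name> <old-ip> <old-port> <new-ip> <new-port>`) and shows how a redis-py client might resolve the primary through Sentinel instead of a hard-coded IP. The helper names `parse_switch_master` and `primary_client`, the timeout values, and the new-primary address in the usage note are assumptions for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class SwitchMaster(NamedTuple):
    name: str
    old_addr: Tuple[str, int]
    new_addr: Tuple[str, int]


def parse_switch_master(payload: str) -> SwitchMaster:
    """Parse '<master-name> <old-ip> <old-port> <new-ip> <new-port>'."""
    name, old_ip, old_port, new_ip, new_port = payload.split()
    return SwitchMaster(name, (old_ip, int(old_port)), (new_ip, int(new_port)))


def primary_client(sentinel_addrs: List[Tuple[str, int]],
                   service: str = "mymaster",
                   password: Optional[str] = None):
    """Return a redis-py client that asks Sentinel for the current primary.

    Requires the third-party `redis` package. If the Sentinels themselves
    use requirepass, pass sentinel_kwargs={"password": ...} to Sentinel().
    """
    from redis.sentinel import Sentinel  # imported lazily; pip install redis

    sentinel = Sentinel(sentinel_addrs, socket_timeout=0.5)
    # master_for() returns a client whose connection pool re-resolves the
    # primary via Sentinel when it has to reconnect after a failover.
    return sentinel.master_for(service, socket_timeout=0.5, password=password)
```

For example, `parse_switch_master("mymaster 10.0.1.10 6379 10.0.2.10 6379")` yields the old and new primary addresses as `(host, port)` tuples, which a monitoring hook could log or forward.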

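Split-brain in Sentinel: if min-replicas-to-write is not set on the primary, an old primary that is partitioned away from its replicas keeps accepting writes while the Sentinels promote a replacement, and those writes are discarded when it rejoins as a replica. Configure min-replicas-to-write 1 and min-replicas-max-lag 10 on the primary so that it refuses writes once no replica has acknowledged within 10 seconds. This bounds the window of lost writes; it does not eliminate it.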
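
The write gate that these two settings create can be modeled in a few lines. This is a simplified sketch of the rule, not Redis source code; `accepts_writes` is a hypothetical name, and each lag value stands for the seconds since a replica's last acknowledgement.

```python
from typing import Iterable


def accepts_writes(replica_lags_s: Iterable[float],
                   min_replicas_to_write: int = 1,
                   min_replicas_max_lag: int = 10) -> bool:
    """Model the primary's decision to accept or refuse a write.

    A replica counts as "good" if its last acknowledgement is at most
    `min_replicas_max_lag` seconds old. Writes are refused when fewer than
    `min_replicas_to_write` good replicas remain.
    """
    if min_replicas_to_write == 0:
        return True  # feature disabled, which is the Redis default
    good = sum(1 for lag in replica_lags_s if lag <= min_replicas_max_lag)
    return good >= min_replicas_to_write


if __name__ == "__main__":
    print(accepts_writes([0, 1]))    # healthy replicas
    print(accepts_writes([12, 30]))  # isolated primary: all acks are stale
```

An isolated primary therefore keeps accepting writes for up to min-replicas-max-lag seconds before the gate closes.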

Redis Cluster

Redis Cluster shards data across up to a recommended maximum of about 1,000 nodes using 16,384 hash slots. Each key maps to a slot via CRC16(key) % 16384. Each of the N primary shards owns a subset of the slots (contiguous ranges on a freshly created cluster, possibly fragmented after resharding), and each primary has one or more replicas. The cluster is self-healing: it detects node failures through gossip, promotes replicas automatically, and re-advertises slot ownership without external coordination.
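
The slot mapping is easy to reproduce offline. The sketch below assumes the CRC16 variant is XMODEM (which Python's `binascii.crc_hqx` computes with an initial value of 0) and applies the hash-tag rule: if the key contains a non-empty `{...}` section, only that section is hashed. `key_slot` is an illustrative name.

```python
import binascii

HASH_SLOTS = 16384


def key_slot(key: bytes) -> int:
    """Return the Redis Cluster hash slot for `key`."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % HASH_SLOTS


if __name__ == "__main__":
    print(key_slot(b"foo"))
    # Keys sharing a hash tag land in the same slot, so multi-key
    # operations across them are allowed.
    print(key_slot(b"{user:1}:profile") == key_slot(b"{user:1}:sessions"))
```

On a live cluster, the result can be compared against `CLUSTER KEYSLOT <key>`.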

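The minimum viable production cluster is 6 nodes: 3 primaries + 3 replicas, with the primaries spread across three AZs and each replica in a different AZ from its primary. A single-AZ failure then takes out at most one primary, and the surviving majority of primaries promotes its replica in another AZ.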

Redis Cluster with 3 shards and cross-AZ replica placement: each primary is in a different AZ from its replica, so a single-AZ failure leaves every shard with a surviving copy of its data.
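
A placement plan can be sanity-checked before deployment. The sketch below is an assumption-laden model rather than a Redis tool: the layout dictionary and the name `survives_any_az_failure` are hypothetical. It checks two things for every single-AZ outage: a majority of primaries survives (needed to authorize replica promotion), and every shard keeps at least one copy outside the failed AZ.

```python
from collections import Counter
from typing import Dict, List, Union

Shard = Dict[str, Union[str, List[str]]]  # {"primary": az, "replicas": [az, ...]}


def survives_any_az_failure(shards: Dict[str, Shard]) -> bool:
    """True if losing any one AZ leaves a primary majority and no shard
    with all of its copies in the failed AZ."""
    primaries_per_az = Counter(s["primary"] for s in shards.values())
    all_azs = set(primaries_per_az)
    for s in shards.values():
        all_azs.update(s["replicas"])
    needed = len(shards) // 2 + 1  # majority of primaries

    for failed in all_azs:
        if len(shards) - primaries_per_az[failed] < needed:
            return False  # too few primaries left to vote on promotions
        for s in shards.values():
            copies = [s["primary"], *s["replicas"]]
            if all(az == failed for az in copies):
                return False  # the whole shard lived in the failed AZ
    return True


if __name__ == "__main__":
    layout = {
        "shard-a": {"primary": "az1", "replicas": ["az2"]},
        "shard-b": {"primary": "az2", "replicas": ["az3"]},
        "shard-c": {"primary": "az3", "replicas": ["az1"]},
    }
    print(survives_any_az_failure(layout))
```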

Bootstrap a 6-node cluster (nodes pre-running with cluster-enabled yes):

# redis.conf snippet required on every node
# (comments on their own lines; Redis config files do not support trailing comments)
cluster-enabled yes
cluster-config-file /var/lib/redis/nodes.conf
# ms a node may be unreachable before it is flagged as failing
cluster-node-timeout 15000
# keep serving the slots that are still covered when a shard is lost
cluster-require-full-coverage no
# a replica will not attempt failover if it lost contact with its primary for longer than
# (cluster-node-timeout * 10) + repl-ping-replica-period
cluster-replica-validity-factor 10
appendonly yes
appendfsync everysec

# Create the cluster (redis-cli --cluster is available from Redis 5)
redis-cli --cluster create \
  10.0.1.10:6379 10.0.1.11:6379 10.0.1.12:6379 \
  10.0.2.10:6379 10.0.2.11:6379 10.0.2.12:6379 \
  --cluster-replicas 1 \
  -a <password>

# Verify slot distribution and health
redis-cli -c -h 10.0.1.10 -a <password> cluster info
redis-cli --cluster check 10.0.1.10:6379 -a <password>

Failover Behavior in Depth

In both topologies the failover timeline follows similar phases: subjective down → objective down → election → promotion (Cluster calls the first two states PFAIL and FAIL, and adds a final slot re-advertisement step). Key operational parameters that determine RTO:

cluster-node-timeout (Cluster) / down-after-milliseconds (Sentinel): how long before a node is considered down. Lower = faster failover; too low = false positives during GC pauses or network jitter. 5–15 s is a common range.
Replica replication lag at the moment of primary failure determines how much data the promoted replica is missing. With default async replication, any writes the replica had not yet received are lost. Quantify this with INFO replication: compare master_repl_offset on the primary with slave_repl_offset on each replica.
In Redis Cluster, clients receive a MOVED redirect (the slot permanently lives on another node) or an ASK redirect (the slot is mid-migration; the client sends ASKING to the target node and retries there once). Cluster-aware clients (Lettuce, redis-py with RedisCluster) handle these transparently; generic clients do not.
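
To make the redirect handling concrete, here is a small sketch of what a cluster-aware client does with the error text. Redirect errors have the form `MOVED <slot> <host>:<port>` and `ASK <slot> <host>:<port>`. The names `Redirect`, `parse_redirect`, and `updates_slot_map` are illustrative and do not come from any client library.

```python
from typing import NamedTuple, Optional


class Redirect(NamedTuple):
    kind: str  # "MOVED" or "ASK"
    slot: int
    host: str
    port: int


def parse_redirect(error: str) -> Optional[Redirect]:
    """Parse a cluster redirect error, or return None for other errors."""
    parts = error.lstrip("-").split()
    if len(parts) != 3 or parts[0] not in ("MOVED", "ASK"):
        return None
    host, _, port = parts[2].rpartition(":")
    return Redirect(parts[0], int(parts[1]), host, int(port))


def updates_slot_map(redirect: Redirect) -> bool:
    """MOVED is permanent, so the client should update its cached slot map.
    ASK is a one-off: send ASKING to the target, retry once, cache nothing."""
    return redirect.kind == "MOVED"


if __name__ == "__main__":
    for err in ("MOVED 3999 127.0.0.1:6381", "ASK 3999 127.0.0.1:6381"):
        r = parse_redirect(err)
        print(r, "update slot map:", updates_slot_map(r))
```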

Production failover drill: run redis-cli DEBUG SLEEP 30 on the primary while watching redis-cli -h <sentinel> -p 26379 SENTINEL masters in another terminal. On Redis 7 and later, DEBUG is disabled by default and must first be allowed via enable-debug-command. Measure actual failover duration and confirm clients reconnected. Schedule this quarterly, because topology knowledge decays.
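
Timing the drill by eye is imprecise, so a small poller can help. The sketch below is one possible approach under stated assumptions: `measure_failover` and `sentinel_resolver` are hypothetical helpers, the resolver uses redis-py's `Sentinel.discover_master`, and the clock and sleep functions are injectable so the loop can be exercised without a live deployment.

```python
import time
from typing import Callable, List, Optional, Tuple

Addr = Tuple[str, int]


def measure_failover(resolve: Callable[[], Addr],
                     initial: Addr,
                     timeout_s: float = 120.0,
                     poll_s: float = 0.2,
                     clock: Callable[[], float] = time.monotonic,
                     sleep: Callable[[float], None] = time.sleep) -> Optional[float]:
    """Poll `resolve()` until it reports a primary other than `initial`.

    Returns the elapsed seconds, or None if no change is seen in time.
    """
    start = clock()
    while clock() - start < timeout_s:
        try:
            if resolve() != initial:
                return clock() - start
        except Exception:
            pass  # Sentinels may briefly fail to answer mid-failover
        sleep(poll_s)
    return None


def sentinel_resolver(sentinel_addrs: List[Addr], service: str = "mymaster") -> Callable[[], Addr]:
    """Build a resolver backed by redis-py (third-party: pip install redis)."""
    from redis.sentinel import Sentinel  # imported lazily

    sentinel = Sentinel(sentinel_addrs, socket_timeout=0.5)
    return lambda: tuple(sentinel.discover_master(service))
```

A drill would start the poller with the current primary address, trigger the outage, and record the returned duration alongside the client-side error counts.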

Sentinel vs Cluster: Choosing the Right Model

Dimension	Sentinel	Cluster
Data size	Single node fits in RAM	Horizontal sharding
Multi-key ops	All operations supported	Cross-slot ops need hash tags `{}`
Client requirement	Sentinel-aware client	Cluster-aware client
Ops complexity	Lower	Higher (resharding, slot migration)
Write throughput	Single node ceiling (roughly ~500k ops/s, hardware- and workload-dependent)	Scales near-linearly with shards

Managed Redis HA: ElastiCache (AWS), Memorystore (GCP), and Azure Cache for Redis all provide replicated failover or Cluster-mode sharding as managed features. You still need to understand the underlying mechanics to configure node sizes, replica counts, Multi-AZ, and maintenance windows correctly, and to interpret the metrics they surface in CloudWatch or Cloud Monitoring (formerly Stackdriver).

At very large scale, teams typically run Cluster with 9 or more shards (3 primaries per AZ) and cluster-replicas 2, that is, two replicas per shard placed in the other two AZs, so that an AZ failure still leaves every shard with a promotable replica and a spare for reads. Shard count is determined by peak write throughput and working-set size, not total data volume: Redis keys evicted by maxmemory-policy allkeys-lru are gone, so size the cluster so the hot working set fits in RAM with 30% headroom.
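
The sizing rule can be turned into a quick back-of-the-envelope calculation. This is a rough sketch under stated assumptions: the per-shard write ceiling is an input you must measure on your own hardware, headroom is interpreted as the fraction of node RAM left unused, and the floor of 3 reflects the minimum number of primaries. `shards_needed` is an illustrative name.

```python
import math


def shards_needed(working_set_gb: float,
                  node_ram_gb: float,
                  peak_writes_per_s: float,
                  per_shard_writes_per_s: float,
                  headroom: float = 0.30) -> int:
    """Primary shard count from memory and write-throughput constraints."""
    usable_gb = node_ram_gb * (1 - headroom)  # RAM left after headroom
    by_memory = math.ceil(working_set_gb / usable_gb)
    by_writes = math.ceil(peak_writes_per_s / per_shard_writes_per_s)
    return max(by_memory, by_writes, 3)  # a cluster needs at least 3 primaries


if __name__ == "__main__":
    # Hypothetical inputs: 300 GB hot set, 64 GB nodes,
    # 400k writes/s peak, 100k writes/s measured per shard.
    print(shards_needed(300, 64, 400_000, 100_000))
```

With these example inputs memory is the binding constraint; rounding the result up to a multiple of the AZ count keeps primaries evenly spread.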