Redis & Advanced Caching

Redis Clustering

20 min Lesson 20 of 30


Redis Cluster provides horizontal scaling and high availability by distributing data across multiple Redis nodes. It automatically handles data sharding, replication, and failover without a single point of failure.

Why Use Redis Cluster?

As your application grows, a single Redis instance may not handle the load. Redis Cluster solves this by spreading data and traffic across multiple servers.

Benefits: Horizontal scalability (add more nodes), automatic sharding (data distribution), high availability (automatic failover), no single point of failure.

Cluster Architecture

Redis Cluster uses a decentralized architecture with no central coordinator: every node knows about every other node via a gossip protocol, and clients can connect to any node.

Redis Cluster Structure:

Master 1 (slots 0-5460)     Master 2 (slots 5461-10922)     Master 3 (slots 10923-16383)
    |                              |                                  |
  Replica 1                      Replica 2                         Replica 3

- 16,384 hash slots total
- Each master handles a range of slots
- Replicas provide redundancy
- Automatic failover when master fails

Hash Slots

Redis Cluster divides the key space into 16,384 hash slots. Each key is mapped to a slot using CRC16 hashing.

// How keys are mapped to slots:
slot = CRC16(key) % 16384

// Examples:
CRC16("user:1000") % 16384 = 5478  // Stored on Master 2 (slots 5461-10922)
CRC16("product:5") % 16384 = 12890 // Stored on Master 3

// Hash tags for same-slot storage:
// Use {tag} to force keys to same slot
"user:{1000}:profile"  // All keys with {1000} go to same slot
"user:{1000}:orders"
"user:{1000}:cart"

// This enables multi-key operations:
MGET user:{1000}:profile user:{1000}:orders
Tip: Use hash tags {tag} when you need to perform multi-key operations like MGET or transactions on related keys.
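The slot mapping can be reproduced client-side. A minimal sketch, assuming ASCII keys: crc16 implements CRC-16/XMODEM (polynomial 0x1021, initial value 0), the variant Redis Cluster uses, and keySlot applies the {tag} rule before hashing.

```javascript
// Compute a key's hash slot the way Redis Cluster does (sketch).
// CRC-16/XMODEM: MSB-first, poly 0x1021, init 0, no reflection.
function crc16(str) {
  let crc = 0;
  for (let i = 0; i < str.length; i++) {
    crc ^= str.charCodeAt(i) << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = crc & 0x8000 ? ((crc << 1) ^ 0x1021) & 0xffff : (crc << 1) & 0xffff;
    }
  }
  return crc;
}

function keySlot(key) {
  // If the key contains a non-empty {tag}, only the tag is hashed
  const open = key.indexOf('{');
  if (open !== -1) {
    const close = key.indexOf('}', open + 1);
    if (close > open + 1) key = key.slice(open + 1, close);
  }
  return crc16(key) % 16384;
}

keySlot('user:{1000}:profile') === keySlot('user:{1000}:orders'); // same tag, same slot
```

This is why hash-tagged keys land on the same master: the braces shrink the hashed portion of the key down to the shared tag.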

Cluster Configuration

Setting up a minimal Redis Cluster with 3 masters and 3 replicas requires 6 nodes.

# redis-node-1.conf (Master 1)
port 7000
cluster-enabled yes
cluster-config-file nodes-7000.conf
cluster-node-timeout 5000
appendonly yes
dir /var/redis/7000

# redis-node-2.conf (Master 2)
port 7001
cluster-enabled yes
cluster-config-file nodes-7001.conf
cluster-node-timeout 5000
appendonly yes
dir /var/redis/7001

# redis-node-3.conf (Master 3)
port 7002
cluster-enabled yes
cluster-config-file nodes-7002.conf
cluster-node-timeout 5000
appendonly yes
dir /var/redis/7002

# Replica configs (7003, 7004, 7005) similar
# with different ports and directories
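Since the six files differ only in port and paths, they can be generated instead of hand-written. A small Node.js sketch, using the illustrative ports and directory layout from the configs above:

```javascript
// Generate config text for each of the six cluster nodes (sketch).
// Ports 7000-7005 and /var/redis/<port> match the example layout above.
function nodeConfig(port) {
  return [
    `port ${port}`,
    'cluster-enabled yes',
    `cluster-config-file nodes-${port}.conf`,
    'cluster-node-timeout 5000',
    'appendonly yes',
    `dir /var/redis/${port}`,
  ].join('\n') + '\n';
}

const configs = {};
for (let port = 7000; port <= 7005; port++) {
  configs[`redis-node-${port}.conf`] = nodeConfig(port);
}
// Write each entry to disk with fs.writeFileSync if desired
```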

Creating a Cluster

Use redis-cli to create the cluster and assign slots to masters.

# Start all Redis instances
redis-server /path/to/redis-node-1.conf
redis-server /path/to/redis-node-2.conf
redis-server /path/to/redis-node-3.conf
redis-server /path/to/redis-node-4.conf
redis-server /path/to/redis-node-5.conf
redis-server /path/to/redis-node-6.conf

# Create cluster (Redis 5+)
redis-cli --cluster create \
  127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 \
  127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 \
  --cluster-replicas 1

# Output:
# Master[0] -> Slots 0-5460
# Master[1] -> Slots 5461-10922
# Master[2] -> Slots 10923-16383
# Replica assignments...

Connecting to Cluster

Use cluster-aware Redis clients that handle redirections automatically.

const Redis = require('ioredis');

// Cluster connection
const cluster = new Redis.Cluster([
  { host: '127.0.0.1', port: 7000 },
  { host: '127.0.0.1', port: 7001 },
  { host: '127.0.0.1', port: 7002 }
], {
  redisOptions: {
    password: 'your-password'
  },
  clusterRetryStrategy: (times) => {
    return Math.min(100 * times, 2000);
  }
});

// Operations work transparently
await cluster.set('user:1000', 'John');
const user = await cluster.get('user:1000');

// Client handles redirections automatically
// If key is on different node, client follows MOVED/ASK
Client Redirections: When a key is on a different node, Redis returns MOVED (permanent) or ASK (temporary) redirects. Cluster-aware clients handle these automatically.
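What a redirection actually carries can be sketched in a few lines. The error string format "MOVED <slot> <host>:<port>" (and the same shape for ASK) comes from the Redis Cluster specification; cluster-aware clients parse it like this internally and re-issue the command against the indicated node:

```javascript
// Parse a cluster redirection error into its parts (sketch).
// MOVED = slot has permanently moved; ASK = one-request redirect
// during a slot migration (the client must send ASKING first).
function parseRedirect(errMessage) {
  const m = /^(MOVED|ASK) (\d+) (\S+):(\d+)$/.exec(errMessage);
  if (!m) return null;
  return {
    type: m[1],          // 'MOVED' or 'ASK'
    slot: Number(m[2]),  // hash slot the key belongs to
    host: m[3],          // node that owns (or is importing) the slot
    port: Number(m[4]),
  };
}

const r = parseRedirect('MOVED 5478 127.0.0.1:7001');
// r.type === 'MOVED', r.slot === 5478, r.host === '127.0.0.1', r.port === 7001
```

On a MOVED reply, clients also update their cached slot-to-node map so future commands for that slot go to the right node directly.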

Adding Nodes to Cluster

Scale your cluster by adding new master or replica nodes dynamically.

# Start new node
redis-server /path/to/redis-node-7.conf

# Add as master
redis-cli --cluster add-node \
  127.0.0.1:7006 127.0.0.1:7000

# Add as replica of specific master
redis-cli --cluster add-node \
  127.0.0.1:7006 127.0.0.1:7000 \
  --cluster-slave \
  --cluster-master-id <master-node-id>

# Rebalance slots to new master
redis-cli --cluster rebalance 127.0.0.1:7000 \
  --cluster-use-empty-masters

Removing Nodes

Remove nodes gracefully by redistributing their slots first.

# Get node ID
redis-cli -p 7000 cluster nodes

# Reshard slots away from node
redis-cli --cluster reshard 127.0.0.1:7000 \
  --cluster-from <source-node-id> \
  --cluster-to <destination-node-id> \
  --cluster-slots <number-of-slots>

# Remove empty node
redis-cli --cluster del-node 127.0.0.1:7000 <node-id>

# For replica, just remove (no resharding needed)
redis-cli --cluster del-node 127.0.0.1:7003 <replica-node-id>

Automatic Failover

When a master fails, Redis Cluster automatically promotes a replica to master.

# Failover process:
# 1. Replicas detect master failure (node-timeout)
# 2. Replica election (lowest replica rank wins)
# 3. Winner promotes itself to master
# 4. New master claims slots
# 5. Cluster continues operating

# Monitor cluster health
redis-cli -p 7000 cluster info

# Output:
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
Tip: Set cluster-node-timeout to an appropriate value (5000ms is typical). Too low a value causes false failure detections; too high a value delays failover.
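The "lowest replica rank wins" step can be sketched as a pure function. This is a simplified model: real Redis ranks the failed master's replicas by replication offset, so the most up-to-date replica gets rank 0 and starts its election attempt first (the actual election also involves epoch voting by the other masters).

```javascript
// Simplified model of replica ranking during failover.
// Higher replication offset = more of the master's data received
// = lower rank = shorter election delay.
function rankReplicas(replicas) {
  return [...replicas]
    .sort((a, b) => b.replOffset - a.replOffset)
    .map((replica, rank) => ({ ...replica, rank }));
}

const ranked = rankReplicas([
  { id: 'replica-a', replOffset: 101500 },
  { id: 'replica-b', replOffset: 104200 }, // most data replicated
  { id: 'replica-c', replOffset: 99800 },
]);
// ranked[0] is replica-b with rank 0: it attempts promotion first
```

Favoring the most up-to-date replica minimizes the writes lost when the old master's unreplicated data is discarded.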

Manual Failover

Trigger manual failover for maintenance or testing.

# Connect to replica and trigger failover
redis-cli -p 7003 cluster failover

# Options:
cluster failover          # Wait for replication, then failover
cluster failover force    # Failover immediately (data loss risk)
cluster failover takeover # Become master without election

# Application code remains unaffected
# Cluster clients detect topology change and reconnect

Data Migration

Redis Cluster migrates keys between nodes during resharding.

# Migrate slots 1000-1500 from node A to node B

# Step 1: Prepare receiving node
CLUSTER SETSLOT <slot> IMPORTING <source-node-id>

# Step 2: Prepare source node
CLUSTER SETSLOT <slot> MIGRATING <destination-node-id>

# Step 3: Migrate keys
CLUSTER GETKEYSINSLOT <slot> 100  # Get keys
MIGRATE <dest-host> <dest-port> <key> 0 5000  # Migrate each

# Step 4: Assign slot to new node
CLUSTER SETSLOT <slot> NODE <destination-node-id>

# redis-cli --cluster reshard automates this process

Cluster vs Sentinel

Redis Cluster and Redis Sentinel solve different problems.

Redis Cluster:
✓ Horizontal scaling (multiple masters)
✓ Automatic sharding across nodes
✓ No single point of failure
✓ 16,384 hash slots distribution
✓ Automatic failover
✗ More complex setup
✗ Some commands limited (multi-key ops)

Redis Sentinel:
✓ High availability for single master
✓ Automatic failover
✓ Monitoring and notifications
✓ Simpler setup
✗ No horizontal scaling (single master)
✗ Manual sharding required
✗ All writes flow through one master

Use Cluster when: Need to scale beyond single server
Use Sentinel when: Single master sufficient, need HA
Decision Guide: Start with single Redis instance. Add Sentinel for high availability. Upgrade to Cluster when data/traffic exceeds single server capacity.

Cluster Limitations

Be aware of Redis Cluster constraints when designing your application.

// Multi-key operations require keys on same slot
// ✗ Won't work:
MGET user:1000 user:2000 user:3000  // Different slots

// ✓ Works with hash tags:
MGET user:{1000}:profile user:{1000}:orders

// Database selection not supported
// ✗ Won't work:
SELECT 1  // Cluster uses database 0 only

// Pub/Sub is global (all nodes)
// Messages published to any node reach all subscribers

// Transactions limited to same slot
// ✓ Works:
MULTI
SET user:{1000}:name "John"
INCR user:{1000}:visits
EXEC

// ✗ Won't work:
MULTI
SET user:1000 "John"  // Slot A
SET user:2000 "Jane"  // Slot B - different node!
EXEC
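A pre-flight check for the CROSSSLOT error can be written without computing any hashes. The sketch below is deliberately conservative: without running CRC16 it can only *guarantee* colocation when every key carries the same non-empty {tag}, which is exactly the pattern the examples above rely on. The function names are illustrative.

```javascript
// Extract the {tag} from a key, following the cluster rule:
// first '{', then first '}' after it, with non-empty content between.
function hashTag(key) {
  const open = key.indexOf('{');
  if (open === -1) return null;
  const close = key.indexOf('}', open + 1);
  return close > open + 1 ? key.slice(open + 1, close) : null;
}

// True only when all keys share one non-empty hash tag, so a
// multi-key command (MGET, MULTI/EXEC) is guaranteed to succeed.
function guaranteedSameSlot(keys) {
  const tags = keys.map(hashTag);
  return tags.every((t) => t !== null && t === tags[0]);
}

guaranteedSameSlot(['user:{1000}:profile', 'user:{1000}:orders']); // true
guaranteedSameSlot(['user:1000', 'user:2000']);                    // false
```

Untagged keys can still happen to share a slot by chance, but designing around shared tags makes colocation intentional rather than accidental.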

Monitoring Cluster Health

Track cluster status and performance metrics regularly.

# Check cluster status
redis-cli --cluster check 127.0.0.1:7000

# Output:
# All 16384 slots covered
# All nodes agree about slots configuration
# All nodes are reachable

# Get node information
redis-cli -p 7000 cluster nodes

# Monitor slot distribution
redis-cli -p 7000 cluster slots

# Key metrics to monitor:
# - cluster_state: ok/fail
# - cluster_slots_fail: Should be 0
# - Memory usage per node
# - Network latency between nodes
# - Failover frequency
# - Keys distribution balance
Warning: Redis Cluster requires at least 3 master nodes. With fewer nodes, cluster cannot form quorum for failover decisions.

Production Best Practices

Follow these guidelines for reliable Redis Cluster deployments.

1. Odd number of masters (3, 5, 7)
   - Ensures clear majority for elections

2. At least one replica per master
   - Enables automatic failover

3. Spread nodes across availability zones
   - Protects against datacenter failures

4. Use hash tags strategically
   - Group related keys for multi-key operations

5. Monitor cluster health continuously
   - Alert on split brain or slot coverage issues

6. Test failover regularly
   - Ensure automatic recovery works

7. Plan for resharding
   - Add capacity before hitting limits

8. Set appropriate timeouts
   - cluster-node-timeout: 5000ms typical
   - cluster-replica-validity-factor: 10 (default)

9. Backup regularly
   - RDB snapshots or AOF persistence

10. Use cluster-aware client libraries
    - ioredis (Node.js), redis-py-cluster (Python)
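For planning resharding (point 7), it helps to know what an even slot split looks like for a given master count. A sketch of one way to compute it, which reproduces the 3-master ranges seen earlier (0-5460, 5461-10922, 10923-16383); redis-cli's exact rounding may differ for other master counts:

```javascript
// Split the 16,384 hash slots evenly across N masters (sketch).
function splitSlots(masters) {
  const TOTAL = 16384;
  const ranges = [];
  for (let i = 0; i < masters; i++) {
    ranges.push({
      start: Math.round((i * TOTAL) / masters),
      end: Math.round(((i + 1) * TOTAL) / masters) - 1,
    });
  }
  return ranges;
}

splitSlots(3);
// [{ start: 0, end: 5460 }, { start: 5461, end: 10922 }, { start: 10923, end: 16383 }]
```

Comparing the output of `cluster slots` against such a target split is a quick way to spot imbalance before running a rebalance.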
Exercise: Set up a local Redis Cluster with 3 masters and 3 replicas. Store user data using hash tags (user:{id}:profile, user:{id}:orders). Test automatic failover by killing a master node and verifying replica promotion. Add a new master node and rebalance slots. Monitor cluster status and slot distribution. Implement a Node.js application that connects to the cluster and handles multi-key operations correctly.