REST API Development

Microservices API Communication


Understanding Microservices Communication

In a microservices architecture, services need to communicate with each other to fulfill complex business requirements. Unlike monolithic applications where components communicate through direct function calls, microservices must communicate over the network. This introduces challenges around reliability, security, performance, and fault tolerance that require specialized patterns and solutions.

Communication Patterns Overview

Synchronous Communication

Services call each other directly over the network and wait for the response. Common choices include REST, GraphQL, and gRPC.

Advantages: Simple to understand, immediate responses, easier debugging.
Disadvantages: Tight coupling, risk of cascading failures, and both services must be available at the same time.

Asynchronous Communication

Services communicate through message queues or event buses without waiting for responses. Examples include RabbitMQ, Apache Kafka, and AWS SQS.

Advantages: Loose coupling, better fault tolerance, scalability.
Disadvantages: More complex to operate, eventual consistency, harder to debug.
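To make the asynchronous style concrete, here is a minimal sketch that uses an in-memory queue as a stand-in for a real broker. The `InMemoryQueue` class, the `order.created` topic, and the function names are assumptions made for illustration; a real broker would add durability, networking, and acknowledgements.

```javascript
// In-memory stand-in for a broker queue (hypothetical; a real broker such as
// RabbitMQ or Kafka adds durability, networking, and acknowledgements).
class InMemoryQueue {
  constructor() {
    this.messages = [];
  }

  // Producer side: enqueue and return immediately.
  publish(topic, payload) {
    this.messages.push({ topic, payload });
  }

  // Consumer side: process everything that is waiting; returns the count handled.
  drain(handler) {
    let handled = 0;
    while (this.messages.length > 0) {
      handler(this.messages.shift());
      handled += 1;
    }
    return handled;
  }
}

const queue = new InMemoryQueue();
let nextOrderId = 1;

// Order service: creates the order and publishes an event without waiting
// for any consumer to react.
function createOrder(userId, items) {
  const order = { orderId: `ord-${nextOrderId++}`, userId, items };
  queue.publish('order.created', { orderId: order.orderId, userId });
  return order;
}

// Notification service: consumes events later, on its own schedule.
const sentNotifications = [];
function handleMessage({ topic, payload }) {
  if (topic === 'order.created') {
    sentNotifications.push(`Order ${payload.orderId} confirmed for user ${payload.userId}`);
  }
}

const order = createOrder(42, ['book']);
const pendingAfterPublish = queue.messages.length; // consumer has not run yet
queue.drain(handleMessage);

console.log(order.orderId, pendingAfterPublish, sentNotifications);
```

The order is created even though the consumer has not run yet, which is the loose coupling described above. The price is that the notification only happens once the consumer catches up.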

Service-to-Service Authentication

Securing communication between microservices is critical. Several approaches exist, each with different tradeoffs.

1. API Keys (Shared Secrets)

Simple but effective for internal service communication. Each service has a unique API key.

// Service A calling Service B with API key
const axios = require('axios');
const crypto = require('crypto');

// Correlation ID so a request can be traced across services
const generateRequestId = () => crypto.randomUUID();

async function fetchUserOrders(userId) {
  try {
    const response = await axios.get('http://order-service:3001/orders', {
      params: { userId }, // axios URL-encodes query parameters
      headers: {
        'X-API-Key': process.env.ORDER_SERVICE_API_KEY,
        'X-Service-Name': 'user-service',
        'X-Request-ID': generateRequestId()
      }
    });
    return response.data;
  } catch (error) {
    console.error('Failed to fetch orders:', error.message);
    throw error;
  }
}

// In the receiving service (Order Service); `app` is its Express application
const authenticateService = (req, res, next) => {
  const apiKey = req.headers['x-api-key'];
  const serviceName = req.headers['x-service-name'];

  // Validate API key against stored keys
  const validKeys = {
    'user-service': process.env.USER_SERVICE_API_KEY,
    'product-service': process.env.PRODUCT_SERVICE_API_KEY,
    'notification-service': process.env.NOTIFICATION_SERVICE_API_KEY
  };

  if (!apiKey || validKeys[serviceName] !== apiKey) {
    return res.status(401).json({
      error: 'Unauthorized',
      message: 'Invalid service credentials'
    });
  }

  req.callingService = serviceName;
  next();
};

app.use('/orders', authenticateService);

Security Consideration: API keys should be stored in environment variables or secret management systems (AWS Secrets Manager, HashiCorp Vault), never hardcoded. Rotate keys regularly and implement key versioning.
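One possible way to support key versioning during rotation is to accept several active versions per service and compare keys in constant time. The sketch below assumes a hypothetical `serviceKeys` store with literal example values; in practice the values would come from a secret manager.

```javascript
const crypto = require('crypto');

// Hypothetical key store: each service may have several active key versions
// while a rotation is in progress. The literal values are examples only.
const serviceKeys = {
  'user-service': { v1: 'old-key-example', v2: 'new-key-example' }
};

// Hash both values first so the buffers have equal length, which
// crypto.timingSafeEqual requires, then compare in constant time.
function safeEqual(a, b) {
  const ha = crypto.createHash('sha256').update(String(a)).digest();
  const hb = crypto.createHash('sha256').update(String(b)).digest();
  return crypto.timingSafeEqual(ha, hb);
}

// Returns the matching key version, or null when no active version matches.
function findKeyVersion(serviceName, presentedKey) {
  const versions = serviceKeys[serviceName];
  if (!versions || !presentedKey) return null;
  for (const [version, key] of Object.entries(versions)) {
    if (safeEqual(key, presentedKey)) return version;
  }
  return null;
}

console.log(findKeyVersion('user-service', 'new-key-example'));
console.log(findKeyVersion('user-service', 'wrong-key'));
```

Logging which version matched makes it easier to see when every caller has moved to the new key, at which point the old version can be removed.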

2. JWT Service Tokens

Services authenticate with each other using short-lived JWTs that carry service-specific claims.

// Service token generation
const jwt = require('jsonwebtoken');
const axios = require('axios');

function generateServiceToken(serviceName) {
  const now = Math.floor(Date.now() / 1000);
  const payload = {
    sub: serviceName,
    iss: 'api-gateway',
    aud: ['order-service', 'user-service', 'product-service'],
    iat: now,
    exp: now + 5 * 60, // 5 minutes
    service: true,
    permissions: getServicePermissions(serviceName)
  };
  return jwt.sign(payload, process.env.SERVICE_JWT_SECRET, { algorithm: 'HS256' });
}

function getServicePermissions(serviceName) {
  const permissions = {
    'user-service': ['read:orders', 'read:products'],
    'order-service': ['read:users', 'read:products', 'write:notifications'],
    'notification-service': ['read:users']
  };
  return permissions[serviceName] || [];
}

// Making authenticated service calls
async function callOrderService(endpoint, params) {
  const token = generateServiceToken('user-service');
  return axios({
    method: 'GET',
    url: `http://order-service:3001${endpoint}`,
    headers: { Authorization: `Bearer ${token}` },
    params // a GET request carries its data in the query string, not a body
  });
}

// Receiving service validation
const verifyServiceToken = (req, res, next) => {
  const token = req.headers.authorization?.split(' ')[1];
  if (!token) {
    return res.status(401).json({ error: 'No token provided' });
  }

  try {
    // Pin the accepted algorithm so a token cannot choose its own
    const decoded = jwt.verify(token, process.env.SERVICE_JWT_SECRET, {
      algorithms: ['HS256']
    });

    // Check if it's a service token
    if (!decoded.service) {
      return res.status(403).json({ error: 'Not a service token' });
    }

    // Check if token audience includes this service (aud may be a string or an array)
    const audiences = [].concat(decoded.aud || []);
    if (!audiences.includes('order-service')) {
      return res.status(403).json({ error: 'Token not valid for this service' });
    }

    req.callingService = decoded.sub;
    req.servicePermissions = decoded.permissions;
    next();
  } catch (error) {
    // Expired, malformed, or wrongly signed: the caller is not authenticated
    return res.status(401).json({ error: 'Invalid token' });
  }
};
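The token above carries a `permissions` claim, and `verifyServiceToken` copies it onto `req.servicePermissions`, but nothing checks it yet. A minimal sketch of one way to enforce it follows. The `requirePermission` factory and the `fakeResponse` helper are hypothetical names introduced here so the example runs without Express.

```javascript
// Express-style middleware factory: lets the request through only when the
// verified service token carried the required permission.
function requirePermission(permission) {
  return (req, res, next) => {
    const granted = req.servicePermissions || [];
    if (!granted.includes(permission)) {
      return res.status(403).json({
        error: 'Forbidden',
        message: `Service ${req.callingService || 'unknown'} lacks ${permission}`
      });
    }
    next();
  };
}

// Minimal fake response object so the sketch runs without Express.
function fakeResponse() {
  return {
    statusCode: 200,
    body: null,
    status(code) { this.statusCode = code; return this; },
    json(payload) { this.body = payload; return this; }
  };
}

const guard = requirePermission('read:orders');

// A caller that holds the permission reaches the next handler.
const allowedRes = fakeResponse();
let allowedCalledNext = false;
guard(
  { callingService: 'user-service', servicePermissions: ['read:orders'] },
  allowedRes,
  () => { allowedCalledNext = true; }
);

// A caller without it is rejected with 403.
const deniedRes = fakeResponse();
let deniedCalledNext = false;
guard(
  { callingService: 'notification-service', servicePermissions: ['read:users'] },
  deniedRes,
  () => { deniedCalledNext = true; }
);

console.log(allowedCalledNext, deniedRes.statusCode, deniedCalledNext);
```

In an Express app this would typically be chained after token verification, for example `app.get('/orders', verifyServiceToken, requirePermission('read:orders'), handler)`.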

3. Mutual TLS (mTLS)

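Both client and server authenticate each other using TLS certificates. This gives the strongest guarantees of the three approaches but is the most complex to operate, because every service needs its own certificate and a process for rotating it.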

// Node.js mTLS client configuration
const https = require('https');
const fs = require('fs');

const options = {
  hostname: 'order-service',
  port: 3001,
  path: '/orders',
  method: 'GET',
  // Client certificate
  cert: fs.readFileSync('./certs/user-service-cert.pem'),
  key: fs.readFileSync('./certs/user-service-key.pem'),
  // CA certificate to verify server
  ca: fs.readFileSync('./certs/ca-cert.pem'),
  // Reject unauthorized certificates
  rejectUnauthorized: true
};

const req = https.request(options, (res) => {
  let data = '';
  res.on('data', (chunk) => data += chunk);
  res.on('end', () => console.log('Response:', data));
});

req.on('error', (error) => console.error('mTLS error:', error));
req.end();

Best Practice: Use service mesh solutions like Istio or Linkerd to automate mTLS implementation across all services. They handle certificate rotation, mutual authentication, and encrypted communication transparently.
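For services that terminate TLS themselves rather than through a mesh, the server side of mTLS might look roughly like the sketch below. The helper names are hypothetical, and reading the caller's identity from the certificate's Common Name is an assumption about how the certificates were issued.

```javascript
const https = require('https');

// Build options for https.createServer that require a client certificate
// signed by the given CA. The PEM contents are supplied by the caller.
function buildMtlsServerOptions({ cert, key, ca }) {
  return {
    cert,                     // this service's certificate
    key,                      // this service's private key
    ca,                       // CA used to verify client certificates
    requestCert: true,        // ask every client for a certificate
    rejectUnauthorized: true  // refuse clients whose certificate does not verify
  };
}

// Identify the caller from the verified client certificate.
// Assumption: the certificate's Common Name holds the service name.
function callingServiceFrom(req) {
  if (!req.socket.authorized) return null;
  const peer = req.socket.getPeerCertificate();
  return peer && peer.subject ? peer.subject.CN : null;
}

// Starting the server needs real PEM material, so it is wrapped in a
// function instead of running at load time.
function startServer(pem, port) {
  return https
    .createServer(buildMtlsServerOptions(pem), (req, res) => {
      const caller = callingServiceFrom(req);
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ caller }));
    })
    .listen(port);
}
```

With `requestCert` and `rejectUnauthorized` both set, the TLS handshake fails for clients that present no certificate or one not signed by the trusted CA, so the request handler only ever sees verified callers.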

Circuit Breaker Pattern

Prevent cascading failures by failing fast when a service is unavailable. The circuit breaker monitors failures and temporarily blocks requests to failing services.

Circuit Breaker States

  • Closed: Normal operation, requests pass through
  • Open: Service is failing, requests fail immediately without calling the service
  • Half-Open: Testing if service recovered, limited requests allowed
// Circuit breaker implementation with opossum
const CircuitBreaker = require('opossum');
const axios = require('axios');

// Cache of last known good responses, used by the fallback
const productCache = new Map();

// Function to call external service
async function callProductService(productId) {
  const response = await axios.get(
    `http://product-service:3001/products/${productId}`,
    {
      timeout: 3000,
      headers: { 'X-API-Key': process.env.PRODUCT_SERVICE_API_KEY }
    }
  );
  productCache.set(productId, response.data); // remember the last good response
  return response.data;
}

// Circuit breaker configuration
const breakerOptions = {
  timeout: 3000,                // Request timeout
  errorThresholdPercentage: 50, // Open circuit at 50% failure rate
  resetTimeout: 30000,          // Try again after 30 seconds
  volumeThreshold: 10,          // Minimum requests before tripping
  name: 'productServiceBreaker'
};

const productBreaker = new CircuitBreaker(callProductService, breakerOptions);

// opossum registers the fallback through a method call rather than an option
productBreaker.fallback((productId) => {
  console.log(`Call failed or circuit open, returning cached data for product ${productId}`);
  return getCachedProduct(productId);
});

// Event handlers (`alerting` is a placeholder for your monitoring client)
productBreaker.on('open', () => {
  console.error('Circuit breaker opened - product service is down');
  // Send alert to monitoring system
  alerting.send('Product Service Circuit Breaker OPEN');
});

productBreaker.on('halfOpen', () => {
  console.log('Circuit breaker half-open - testing product service');
});

productBreaker.on('close', () => {
  console.log('Circuit breaker closed - product service recovered');
  alerting.send('Product Service Circuit Breaker CLOSED');
});

productBreaker.on('fallback', (result) => {
  console.log('Fallback executed, returning:', result);
});

// Usage in service
async function getProductDetails(productId) {
  try {
    return await productBreaker.fire(productId);
  } catch (error) {
    // Handle complete failure: the call failed and the cache had nothing
    console.error('Product service completely unavailable:', error);
    throw new Error('Product information temporarily unavailable');
  }
}

// Cache lookup for fallback
async function getCachedProduct(productId) {
  const cached = productCache.get(productId);
  if (cached) {
    return { ...cached, cached: true };
  }
  throw new Error('No cached data available');
}

Circuit Breaker Metrics: Monitor circuit breaker statistics (failure rate, open/close events, fallback executions) to identify service health issues and optimize timeout/threshold settings.
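To make the three states concrete, here is a minimal hand-rolled state machine. It is an illustration only and not how opossum is implemented: it trips on a count of consecutive failures rather than a failure percentage, and `SimpleBreaker` with its injectable `now` clock is a hypothetical name chosen for this sketch.

```javascript
// Minimal circuit breaker state machine (illustration only; libraries such
// as opossum add rolling windows, percentage thresholds, timeouts, and events).
class SimpleBreaker {
  constructor({ failureThreshold = 3, resetTimeout = 30000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold; // consecutive failures before opening
    this.resetTimeout = resetTimeout;         // ms to wait before a trial request
    this.now = now;                           // injectable clock, handy for tests
    this.state = 'closed';
    this.failures = 0;
    this.openedAt = 0;
  }

  // Should the next request be attempted?
  canRequest() {
    if (this.state === 'open' && this.now() - this.openedAt >= this.resetTimeout) {
      this.state = 'halfOpen'; // reset timeout elapsed: allow a trial request
    }
    // A fuller version would cap how many trial requests run while half-open.
    return this.state !== 'open';
  }

  recordSuccess() {
    this.failures = 0;
    this.state = 'closed';
  }

  recordFailure() {
    this.failures += 1;
    // A failed trial request, or too many failures in a row, opens the circuit.
    if (this.state === 'halfOpen' || this.failures >= this.failureThreshold) {
      this.state = 'open';
      this.openedAt = this.now();
    }
  }
}

// Walk through one full cycle using a fake clock.
let clock = 0;
const breaker = new SimpleBreaker({ failureThreshold: 2, resetTimeout: 1000, now: () => clock });
const states = [];

breaker.recordFailure();
breaker.recordFailure();              // threshold reached
states.push(breaker.state);
const blockedWhileOpen = !breaker.canRequest();

clock = 1000;                         // reset timeout has elapsed
breaker.canRequest();                 // moves to half-open
states.push(breaker.state);

breaker.recordSuccess();              // trial request succeeded
states.push(breaker.state);

console.log(states.join(' -> '));     // prints "open -> halfOpen -> closed"
```

Counting state transitions like these is also a simple starting point for the metrics mentioned above.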

Retry Strategies

Automatically retry failed requests with intelligent backoff strategies to handle transient failures.

Exponential Backoff with Jitter

// Retry implementation with exponential backoff
const axios = require('axios');

async function retryWithBackoff(fn, maxRetries = 3, initialDelay = 1000) {
  let lastError;

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;

      // Don't retry on 4xx errors (client errors)
      if (error.response?.status >= 400 && error.response?.status < 500) {
        throw error;
      }

      // Last attempt, throw error
      if (attempt === maxRetries) {
        console.error(`All ${maxRetries + 1} attempts failed`);
        throw lastError;
      }

      // Calculate delay with exponential backoff and jitter
      const exponentialDelay = initialDelay * Math.pow(2, attempt);
      const jitter = Math.random() * 1000; // Random 0-1000ms
      const delay = exponentialDelay + jitter;

      console.log(`Attempt ${attempt + 1} failed, retrying in ${delay}ms...`);
      await sleep(delay);
    }
  }

  throw lastError;
}

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

// Usage
async function fetchUserOrders(userId) {
  return await retryWithBackoff(async () => {
    const response = await axios.get(
      `http://order-service:3001/orders?userId=${userId}`,
      {
        timeout: 5000,
        headers: { 'X-API-Key': process.env.ORDER_SERVICE_API_KEY }
      }
    );
    return response.data;
  }, 3, 1000); // Max 3 retries, starting with 1 second delay
}

Jitter Importance: Adding random jitter prevents the "thundering herd" problem where multiple clients retry simultaneously, potentially overwhelming the recovering service.
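The helper above adds up to one second of jitter on top of the exponential delay. A common alternative, often called "full jitter", picks the whole delay at random between zero and a capped exponential ceiling, which spreads retries more widely. The sketch below is one way to compute it; `fullJitterDelay` and its option names are hypothetical, and the `random` parameter exists only so the result can be checked deterministically.

```javascript
// "Full jitter" delay: a random value between 0 and the capped exponential ceiling.
function fullJitterDelay(attempt, { baseMs = 1000, capMs = 30000, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// With a fixed "random" value of 0.5 the delays are half of each ceiling:
// ceilings are 1000, 2000, 4000, 8000 ms for attempts 0 to 3.
const halfway = () => 0.5;
const delays = [0, 1, 2, 3].map((attempt) => fullJitterDelay(attempt, { random: halfway }));
console.log(delays); // prints [ 500, 1000, 2000, 4000 ]

// The cap keeps late attempts from waiting unreasonably long.
console.log(fullJitterDelay(10, { random: halfway })); // prints 15000
```

To try it in `retryWithBackoff`, the three lines that compute `delay` could be replaced with a call to `fullJitterDelay(attempt, { baseMs: initialDelay })`.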

Advanced Retry with axios-retry

const axios = require('axios');
// With axios-retry v4 and CommonJS, use require('axios-retry').default instead
const axiosRetry = require('axios-retry');

// Configure axios with retry logic
axiosRetry(axios, {
  retries: 3,
  retryDelay: axiosRetry.exponentialDelay, // Built-in exponential backoff
  retryCondition: (error) => {
    // Retry on network errors or 5xx responses.
    // Note: the 5xx check applies to every method, so POST requests are
    // retried too; keep it only if those endpoints are idempotent.
    return axiosRetry.isNetworkOrIdempotentRequestError(error) ||
      (error.response?.status >= 500 && error.response?.status < 600);
  },
  onRetry: (retryCount, error, requestConfig) => {
    console.log(`Retry attempt ${retryCount} for ${requestConfig.url}`);
  }
});

// Now all requests made through this axios instance retry automatically
async function getProduct(productId) {
  const response = await axios.get(
    `http://product-service:3001/products/${productId}`,
    {
      timeout: 3000,
      headers: { 'X-API-Key': process.env.PRODUCT_SERVICE_API_KEY }
    }
  );
  return response.data;
}

Timeout Management

Proper timeout configuration prevents indefinite waiting and cascading delays across services.

// Multi-level timeout strategy
const axios = require('axios');

// Create axios instance with default timeouts
const serviceClient = axios.create({
  // Assumed variable name; the relative paths below need a base URL
  baseURL: process.env.SERVICE_BASE_URL,
  timeout: 5000, // Default 5 second timeout
  headers: {
    'X-API-Key': process.env.API_KEY,
    'X-Service-Name': 'user-service'
  }
});

// Request timeout (total time including retries).
// AbortController is a global in Node.js 15 and later.
async function fetchWithTotalTimeout(url, totalTimeout = 10000) {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), totalTimeout);

  try {
    const response = await serviceClient.get(url, { signal: controller.signal });
    return response.data;
  } catch (error) {
    // axios reports an aborted request as a cancellation, not an 'AbortError'
    if (axios.isCancel(error)) {
      throw new Error(`Request timeout after ${totalTimeout}ms`);
    }
    throw error;
  } finally {
    clearTimeout(timeoutId);
  }
}

// Different timeouts for different operations
async function getUserProfile(userId) {
  // Fast operation - 2 second timeout
  return await serviceClient.get(`/users/${userId}`, { timeout: 2000 });
}

async function generateReport(userId) {
  // Slow operation - 30 second timeout
  return await serviceClient.post('/reports/generate', { userId }, { timeout: 30000 });
}

Timeout Best Practices: Set timeouts shorter than your load balancer/API gateway timeouts. A good rule: Service timeout < Gateway timeout < Client timeout. Monitor P95/P99 response times to set appropriate timeouts.
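As a rough illustration of turning observed latencies into a timeout, the sketch below computes nearest-rank percentiles and applies a simple heuristic. The `suggestTimeout` function, its 1.5x headroom, and the one-second margin below the gateway timeout are assumptions for the example, not established rules, and the latency samples are made up.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('No samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical heuristic: P99 plus some headroom, kept below the gateway timeout.
function suggestTimeout(samples, { headroom = 1.5, gatewayTimeoutMs = 10000 } = {}) {
  const p99 = percentile(samples, 99);
  return Math.min(Math.round(p99 * headroom), gatewayTimeoutMs - 1000);
}

// Example latencies in milliseconds (invented for illustration).
const latencies = [120, 80, 95, 400, 110, 105, 90, 100, 130, 85];

console.log('P50:', percentile(latencies, 50));
console.log('P95:', percentile(latencies, 95));
console.log('Suggested timeout:', suggestTimeout(latencies));
```

With only ten samples the P95 and P99 both land on the single slowest request, which is why percentile-based timeouts are usually derived from a much larger window of production measurements.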

gRPC for High-Performance Communication

gRPC is a high-performance RPC framework using HTTP/2 and Protocol Buffers. Ideal for service-to-service communication.

Why gRPC?

  • Performance: Binary protocol, HTTP/2 multiplexing, smaller payloads
  • Type Safety: Strongly-typed contracts with Protocol Buffers
  • Bidirectional Streaming: Server and client streaming support
  • Code Generation: Auto-generate client/server code in multiple languages
// user.proto - Protocol Buffer definition
syntax = "proto3";

package user;

service UserService {
  rpc GetUser (GetUserRequest) returns (User);
  rpc ListUsers (ListUsersRequest) returns (stream User);
  rpc UpdateUser (UpdateUserRequest) returns (User);
}

message GetUserRequest {
  int32 id = 1;
}

message ListUsersRequest {
  int32 page = 1;
  int32 page_size = 2;
}

message UpdateUserRequest {
  int32 id = 1;
  string name = 2;
  string email = 3;
}

message User {
  int32 id = 1;
  string name = 2;
  string email = 3;
  string created_at = 4;
}

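// gRPC Server (Node.js)
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');

// Load proto file
const packageDefinition = protoLoader.loadSync('user.proto', {
  keepCase: true,
  longs: String,
  enums: String,
  defaults: true,
  oneofs: true
});
const userProto = grpc.loadPackageDefinition(packageDefinition).user;

// `db` stands in for this service's data-access layer (not shown)

// Implement service methods
const getUser = async (call, callback) => {
  const userId = call.request.id;
  try {
    const user = await db.users.findById(userId);
    if (!user) {
      return callback({ code: grpc.status.NOT_FOUND, message: `User ${userId} not found` });
    }
    callback(null, user);
  } catch (error) {
    callback({ code: grpc.status.INTERNAL, message: error.message });
  }
};

const listUsers = async (call) => {
  const { page, page_size } = call.request;
  try {
    // Stream users to client
    const users = await db.users.findAll({ page, page_size });
    users.forEach((user) => call.write(user));
    call.end();
  } catch (error) {
    call.emit('error', { code: grpc.status.INTERNAL, message: error.message });
  }
};

// db.users.update is a placeholder for your own update logic
const updateUser = async (call, callback) => {
  const { id, name, email } = call.request;
  try {
    callback(null, await db.users.update(id, { name, email }));
  } catch (error) {
    callback({ code: grpc.status.INTERNAL, message: error.message });
  }
};

// Create and start server
const server = new grpc.Server();
server.addService(userProto.UserService.service, { getUser, listUsers, updateUser });

server.bindAsync(
  '0.0.0.0:50051',
  grpc.ServerCredentials.createInsecure(), // use TLS credentials outside local development
  (err, port) => {
    if (err) {
      console.error('Server error:', err);
      return;
    }
    console.log(`gRPC server running on port ${port}`);
    // Needed on older @grpc/grpc-js releases; newer ones start automatically
    // after bindAsync and deprecate this call
    server.start();
  }
);
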
// gRPC Client (Node.js)
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');

// Use the same loader options as the server. Without keepCase, field names
// are converted to camelCase and `page_size` below would be ignored.
const packageDefinition = protoLoader.loadSync('user.proto', {
  keepCase: true,
  longs: String,
  enums: String,
  defaults: true,
  oneofs: true
});
const userProto = grpc.loadPackageDefinition(packageDefinition).user;

// Create client
const client = new userProto.UserService(
  'user-service:50051',
  grpc.credentials.createInsecure()
);

// Call getUser method
client.getUser({ id: 123 }, (error, user) => {
  if (error) {
    console.error('gRPC error:', error);
    return;
  }
  console.log('User:', user);
});

// Stream users
const call = client.listUsers({ page: 1, page_size: 10 });

call.on('data', (user) => {
  console.log('Received user:', user);
});

call.on('end', () => {
  console.log('Stream ended');
});

call.on('error', (error) => {
  console.error('Stream error:', error);
});

gRPC Best Practices: Use gRPC for internal service-to-service communication where you control both client and server. Use REST for public APIs and third-party integrations. Consider gRPC-Web for browser clients.

Combining Patterns: Resilient Service Communication

// Service client combining timeouts, retries, and a circuit breaker
const axios = require('axios');
const CircuitBreaker = require('opossum');
// With axios-retry v4 and CommonJS, use require('axios-retry').default instead
const axiosRetry = require('axios-retry');

class ResilientServiceClient {
  constructor(serviceName, baseURL, options = {}) {
    this.serviceName = serviceName;
    this.baseURL = baseURL;

    // 'order-service' -> ORDER_SERVICE_API_KEY (hyphens become underscores)
    const apiKeyVar = `${serviceName.toUpperCase().replace(/-/g, '_')}_API_KEY`;

    // Create axios instance
    this.client = axios.create({
      baseURL,
      timeout: options.timeout || 5000,
      headers: {
        'X-API-Key': process.env[apiKeyVar],
        'X-Service-Name': process.env.SERVICE_NAME
      }
    });

    // Add retry logic
    axiosRetry(this.client, {
      retries: options.retries || 3,
      retryDelay: axiosRetry.exponentialDelay,
      retryCondition: axiosRetry.isNetworkOrIdempotentRequestError
    });

    // Wrap in circuit breaker. The breaker timeout covers the whole call,
    // retries included, so it should exceed a single request timeout.
    this.breaker = new CircuitBreaker(
      (config) => this.client.request(config),
      {
        timeout: options.breakerTimeout || 10000,
        errorThresholdPercentage: 50,
        resetTimeout: 30000,
        name: `${serviceName}Breaker`
      }
    );

    this.setupEventHandlers();
  }

  setupEventHandlers() {
    this.breaker.on('open', () => {
      console.error(`[${this.serviceName}] Circuit breaker OPEN`);
    });
    this.breaker.on('close', () => {
      console.log(`[${this.serviceName}] Circuit breaker CLOSED`);
    });
  }

  async get(url, config = {}) {
    return await this.breaker.fire({ method: 'GET', url, ...config });
  }

  async post(url, data, config = {}) {
    return await this.breaker.fire({ method: 'POST', url, data, ...config });
  }

  async put(url, data, config = {}) {
    return await this.breaker.fire({ method: 'PUT', url, data, ...config });
  }

  async delete(url, config = {}) {
    return await this.breaker.fire({ method: 'DELETE', url, ...config });
  }
}

// Usage
const orderService = new ResilientServiceClient(
  'order-service',
  'http://order-service:3001'
);

const productService = new ResilientServiceClient(
  'product-service',
  'http://product-service:3002'
);

// Make resilient calls
async function getUserOrders(userId) {
  try {
    const response = await orderService.get(`/orders?userId=${userId}`);
    return response.data;
  } catch (error) {
    console.error('Failed to fetch orders:', error.message);
    return [];
  }
}

Exercise: Build a resilient microservices communication system with the following requirements:
  1. Create three services: User Service, Order Service, and Notification Service
  2. Implement JWT-based service-to-service authentication
  3. Add circuit breakers with 50% error threshold and 30-second reset timeout
  4. Implement exponential backoff retry (max 3 retries) with jitter
  5. Set appropriate timeouts: 3s for GET requests, 10s for POST/PUT requests
  6. When Order Service is called, it should:
    • Fetch user details from User Service
    • Create the order
    • Notify user via Notification Service (fire-and-forget, don't fail order creation if notification fails)
  7. Log all circuit breaker state changes and retry attempts

Implement the order creation endpoint with proper error handling and fallback mechanisms.