REST API Development

Microservices API Communication

Understanding Microservices Communication

In a microservices architecture, services need to communicate with each other to fulfill complex business requirements. Unlike monolithic applications where components communicate through direct function calls, microservices must communicate over the network. This introduces challenges around reliability, security, performance, and fault tolerance that require specialized patterns and solutions.

Communication Patterns Overview

Synchronous Communication

The calling service sends a request directly to another service and waits for the response. Common choices include REST and GraphQL over HTTP, and gRPC over HTTP/2.

Advantages: Simple to understand, immediate responses, easier debugging.

Disadvantages: Tight coupling, risk of cascading failures, and both services must be available at the same time.

Asynchronous Communication

Services communicate through message queues or event buses without waiting for responses. Examples include RabbitMQ, Apache Kafka, and AWS SQS.

Advantages: Loose coupling, better fault tolerance, easier scaling.

Disadvantages: More moving parts, eventual consistency, harder to debug.
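The difference from a synchronous call is easiest to see in code. The sketch below uses Node's built-in EventEmitter as an in-process stand-in for a real broker such as RabbitMQ or Kafka. The `bus` object, the `order.created` event name, and both functions are illustrative assumptions, not part of any broker's API.

```javascript
const { EventEmitter } = require('node:events');

// In-process stand-in for a message broker (assumption for illustration only)
const bus = new EventEmitter();
const sentNotifications = [];

// Notification service: subscribes and reacts whenever an event arrives
bus.on('order.created', (event) => {
    sentNotifications.push(`Order ${event.orderId} confirmed for user ${event.userId}`);
});

// Order service: publishes the event and returns without waiting for consumers
function createOrder(orderId, userId) {
    const order = { orderId, userId, status: 'created' };
    // setImmediate defers delivery, mimicking a broker that hands the message over later
    setImmediate(() => bus.emit('order.created', order));
    return order;
}

const order = createOrder(42, 7);
console.log(order.status);             // 'created'
console.log(sentNotifications.length); // 0: the consumer has not run yet
```

The producer finishes before the consumer does any work, which is where both the loose coupling and the eventual consistency come from.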

Service-to-Service Authentication

Securing communication between microservices is critical. Several approaches exist, each with different tradeoffs.

1. API Keys (Shared Secrets)

Simple but effective for internal service communication. Each service has a unique API key.

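// Service A calling Service B with API key
const axios = require('axios');
const crypto = require('crypto');

async function fetchUserOrders(userId) {
    try {
        const response = await axios.get('http://order-service:3001/orders', {
            params: { userId }, // axios URL-encodes query parameters
            headers: {
                'X-API-Key': process.env.ORDER_SERVICE_API_KEY,
                'X-Service-Name': 'user-service',
                'X-Request-ID': crypto.randomUUID() // correlation ID for tracing
            }
        });
        return response.data;
    } catch (error) {
        console.error('Failed to fetch orders:', error.message);
        throw error;
    }
}

// In the receiving service (Order Service); it also needs require('crypto')
// Calling service names mapped to the key each one is expected to present
const validKeys = new Map([
    ['user-service', process.env.USER_SERVICE_API_KEY],
    ['product-service', process.env.PRODUCT_SERVICE_API_KEY],
    ['notification-service', process.env.NOTIFICATION_SERVICE_API_KEY]
]);

// Constant-time comparison so response timing does not leak key contents
function keysMatch(presented, expected) {
    const a = Buffer.from(presented);
    const b = Buffer.from(expected);
    return a.length === b.length && crypto.timingSafeEqual(a, b);
}

const authenticateService = (req, res, next) => {
    const apiKey = req.headers['x-api-key'];
    const serviceName = req.headers['x-service-name'];
    const expectedKey = validKeys.get(serviceName);

    // Reject missing keys, unknown services, and mismatches alike
    if (!apiKey || !expectedKey || !keysMatch(apiKey, expectedKey)) {
        return res.status(401).json({
            error: 'Unauthorized',
            message: 'Invalid service credentials'
        });
    }

    req.callingService = serviceName;
    next();
};

// app is the Order Service's Express application
app.use('/orders', authenticateService);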

Security Consideration: API keys should be stored in environment variables or secret management systems (AWS Secrets Manager, HashiCorp Vault), never hardcoded. Rotate keys regularly and implement key versioning.
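Key versioning can be as simple as accepting both the current and the previous key during a rotation window. The following is a minimal sketch under that assumption; `acceptedKeys`, `safeEqual`, `matchServiceKey`, and the sample key values are hypothetical names chosen for illustration.

```javascript
const crypto = require('node:crypto');

// Hypothetical key store: during a rotation window each service has two accepted keys.
// The literal values are placeholders; real keys would be loaded from a secret manager.
const acceptedKeys = new Map([
    ['user-service', { current: 'key-v2-example', previous: 'key-v1-example' }]
]);

// Constant-time comparison so response timing does not reveal how much of a key matched
function safeEqual(a, b) {
    const bufA = Buffer.from(a);
    const bufB = Buffer.from(b);
    return bufA.length === bufB.length && crypto.timingSafeEqual(bufA, bufB);
}

// Returns which key version matched, or null if the credentials are rejected
function matchServiceKey(serviceName, presentedKey) {
    const keys = acceptedKeys.get(serviceName);
    if (!keys || typeof presentedKey !== 'string') return null;
    if (safeEqual(presentedKey, keys.current)) return 'current';
    if (keys.previous && safeEqual(presentedKey, keys.previous)) return 'previous';
    return null;
}

console.log(matchServiceKey('user-service', 'key-v2-example')); // 'current'
console.log(matchServiceKey('user-service', 'key-v1-example')); // 'previous'
console.log(matchServiceKey('user-service', 'wrong-key'));      // null
```

Logging which version matched shows when every caller has moved to the new key, at which point the previous one can be removed.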

2. JWT Service Tokens

Services authenticate to each other with short-lived JWTs that carry service-specific claims.

// Service token generation
const jwt = require('jsonwebtoken');

function generateServiceToken(serviceName) {
    const payload = {
        sub: serviceName,
        iss: 'api-gateway',
        aud: ['order-service', 'user-service', 'product-service'],
        iat: Math.floor(Date.now() / 1000),
        exp: Math.floor(Date.now() / 1000) + (5 * 60), // 5 minutes
        service: true,
        permissions: getServicePermissions(serviceName)
    };

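    // State the signing algorithm explicitly so signer and verifier agree on it
    return jwt.sign(payload, process.env.SERVICE_JWT_SECRET, { algorithm: 'HS256' });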
}

function getServicePermissions(serviceName) {
    const permissions = {
        'user-service': ['read:orders', 'read:products'],
        'order-service': ['read:users', 'read:products', 'write:notifications'],
        'notification-service': ['read:users']
    };
    return permissions[serviceName] || [];
}

// Making authenticated service calls
const axios = require('axios');

async function callOrderService(endpoint, params) {
    const token = generateServiceToken('user-service');

    return await axios({
        method: 'GET',
        url: `http://order-service:3001${endpoint}`,
        headers: {
            'Authorization': `Bearer ${token}`
        },
        params // GET requests carry their data in the query string, not in a body
    });
}

// Receiving service validation
const verifyServiceToken = (req, res, next) => {
    const token = req.headers.authorization?.split(' ')[1];

    if (!token) {
        return res.status(401).json({ error: 'No token provided' });
    }

    try {
        // Pin the accepted algorithm and let the library check the audience claim
        const decoded = jwt.verify(token, process.env.SERVICE_JWT_SECRET, {
            algorithms: ['HS256'],
            audience: 'order-service'
        });

        // Check if it's a service token
        if (!decoded.service) {
            return res.status(403).json({ error: 'Not a service token' });
        }

        req.callingService = decoded.sub;
        req.servicePermissions = decoded.permissions;
        next();
    } catch (error) {
        // Expired, malformed, or wrong-audience tokens are authentication failures (401)
        return res.status(401).json({ error: 'Invalid token' });
    }
};
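The middleware above stores the token's permissions on `req.servicePermissions`, but nothing shown so far checks them. A minimal sketch of such a check follows; `requirePermission` and `fakeResponse` are hypothetical helpers, and the fake request and response objects only imitate the small part of Express used here.

```javascript
// Middleware factory: lets the request through only if the verified token carries the permission
function requirePermission(permission) {
    return (req, res, next) => {
        const granted = req.servicePermissions || [];
        if (!granted.includes(permission)) {
            return res.status(403).json({
                error: 'Forbidden',
                message: `${req.callingService || 'caller'} lacks ${permission}`
            });
        }
        next();
    };
}

// Minimal stand-in for an Express response, used only to exercise the middleware here
function fakeResponse() {
    return {
        statusCode: 200,
        body: null,
        status(code) { this.statusCode = code; return this; },
        json(payload) { this.body = payload; return this; }
    };
}

const allowedRes = fakeResponse();
let nextCalled = false;
requirePermission('read:orders')(
    { callingService: 'user-service', servicePermissions: ['read:orders', 'read:products'] },
    allowedRes,
    () => { nextCalled = true; }
);

const deniedRes = fakeResponse();
requirePermission('write:orders')(
    { callingService: 'user-service', servicePermissions: ['read:orders'] },
    deniedRes,
    () => {}
);

console.log(nextCalled, deniedRes.statusCode); // true 403
```

In an Express app it would sit after the token check, for example `app.get('/orders', verifyServiceToken, requirePermission('read:orders'), handler)`, where `handler` is your own route function.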

3. Mutual TLS (mTLS)

Client and server each present a certificate and verify the other's during the TLS handshake. This is the strongest of the three options but also the most complex to set up and operate.

// Node.js mTLS client configuration
const https = require('https');
const fs = require('fs');

const options = {
    hostname: 'order-service',
    port: 3001,
    path: '/orders',
    method: 'GET',
    // Client certificate
    cert: fs.readFileSync('./certs/user-service-cert.pem'),
    key: fs.readFileSync('./certs/user-service-key.pem'),
    // CA certificate to verify server
    ca: fs.readFileSync('./certs/ca-cert.pem'),
    // Reject unauthorized certificates
    rejectUnauthorized: true
};

const req = https.request(options, (res) => {
    let data = '';
    res.on('data', (chunk) => data += chunk);
    res.on('end', () => console.log('Response:', data));
});

req.on('error', (error) => console.error('mTLS error:', error));
req.end();

Best Practice: Use service mesh solutions like Istio or Linkerd to automate mTLS implementation across all services. They handle certificate rotation, mutual authentication, and encrypted communication transparently.
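Without a service mesh, the receiving service has to demand client certificates itself. The sketch below shows one way to do that with Node's `https` module. `buildMtlsServerOptions`, `handleRequest`, and `createMtlsServer` are hypothetical names, and the certificate file names in the trailing comment are assumptions modelled on the client example.

```javascript
const https = require('node:https');

// Builds TLS options that make the server request and verify a client certificate.
// The PEM inputs are assumed to be loaded elsewhere (files, secret manager, and so on).
function buildMtlsServerOptions({ cert, key, ca }) {
    return {
        cert,                    // server certificate presented to clients
        key,                     // server private key
        ca,                      // CA used to verify client certificates
        requestCert: true,       // ask every client for a certificate
        rejectUnauthorized: true // refuse connections whose certificate fails verification
    };
}

// Request handler that identifies the caller from the verified certificate.
// The authorized check is defensive; rejectUnauthorized already drops bad handshakes.
function handleRequest(req, res) {
    const peer = req.socket.getPeerCertificate();
    const callingService = peer && peer.subject ? peer.subject.CN : undefined;
    if (!req.socket.authorized || !callingService) {
        res.writeHead(401, { 'Content-Type': 'application/json' });
        return res.end(JSON.stringify({ error: 'Client certificate required' }));
    }
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ callingService }));
}

// Creates the server; call .listen(port) on the result to start it
function createMtlsServer(pems) {
    return https.createServer(buildMtlsServerOptions(pems), handleRequest);
}

// Example wiring (file names are assumptions):
// const fs = require('node:fs');
// createMtlsServer({
//     cert: fs.readFileSync('./certs/order-service-cert.pem'),
//     key: fs.readFileSync('./certs/order-service-key.pem'),
//     ca: fs.readFileSync('./certs/ca-cert.pem')
// }).listen(3001);
```

Reading the caller's identity from the certificate's common name assumes your CA issues one certificate per service with the service name as CN.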

Circuit Breaker Pattern

Prevent cascading failures by failing fast when a service is unavailable. The circuit breaker monitors failures and temporarily blocks requests to failing services.

Circuit Breaker States

- Closed: Normal operation, requests pass through
- Open: Service is failing, requests fail immediately without calling the service
- Half-Open: Testing if service recovered, limited requests allowed
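To make the three states concrete, here is a deliberately small breaker written from scratch. It is a teaching sketch, not a substitute for a library: `SimpleBreaker` is a hypothetical class, it trips on consecutive failures rather than on a failure percentage, and it does not limit how many requests pass while half-open.

```javascript
// Minimal circuit breaker illustrating the closed / open / half-open transitions
class SimpleBreaker {
    constructor({ failureThreshold = 3, resetTimeout = 30000, now = Date.now } = {}) {
        this.failureThreshold = failureThreshold;
        this.resetTimeout = resetTimeout;
        this.now = now;          // injectable clock, handy for tests
        this.state = 'closed';
        this.failures = 0;
        this.openedAt = 0;
    }

    async call(fn) {
        if (this.state === 'open') {
            if (this.now() - this.openedAt < this.resetTimeout) {
                throw new Error('Circuit open: failing fast');
            }
            this.state = 'halfOpen'; // reset timeout elapsed, let a trial request through
        }
        try {
            const result = await fn();
            this.state = 'closed';   // any success closes the circuit and clears the count
            this.failures = 0;
            return result;
        } catch (error) {
            this.failures += 1;
            if (this.state === 'halfOpen' || this.failures >= this.failureThreshold) {
                this.state = 'open'; // trial failed, or too many consecutive failures
                this.openedAt = this.now();
            }
            throw error;
        }
    }
}
```

Usage looks like `await breaker.call(() => axios.get(url))`, where `breaker` is a `SimpleBreaker` instance. While the circuit is open the wrapped function is never invoked, which is what protects the struggling service.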

// Circuit breaker implementation with opossum
const CircuitBreaker = require('opossum');
const axios = require('axios');

// Function to call external service
async function callProductService(productId) {
    const response = await axios.get(
        `http://product-service:3001/products/${productId}`,
        {
            timeout: 3000,
            headers: { 'X-API-Key': process.env.PRODUCT_SERVICE_API_KEY }
        }
    );
    // Remember the latest good response so the fallback has something to serve
    productCache.set(productId, response.data);
    return response.data;
}

// Circuit breaker configuration
const breakerOptions = {
    timeout: 3000, // Fail the call if it takes longer than 3 seconds
    errorThresholdPercentage: 50, // Open circuit at 50% failure rate
    resetTimeout: 30000, // Try again after 30 seconds
    volumeThreshold: 10, // Minimum requests before tripping
    name: 'productServiceBreaker'
};

const productBreaker = new CircuitBreaker(callProductService, breakerOptions);

// opossum registers a fallback with breaker.fallback(), not through the options object.
// It runs when a call fails, times out, or is rejected because the circuit is open.
productBreaker.fallback((productId) => {
    console.log(`Returning cached data for product ${productId}`);
    return getCachedProduct(productId);
});

// Event handlers
productBreaker.on('open', () => {
    console.error('Circuit breaker opened - product service is down');
    // Send alert to monitoring system (alerting stands for your own monitoring client, not shown)
    alerting.send('Product Service Circuit Breaker OPEN');
});

productBreaker.on('halfOpen', () => {
    console.log('Circuit breaker half-open - testing product service');
});

productBreaker.on('close', () => {
    console.log('Circuit breaker closed - product service recovered');
    alerting.send('Product Service Circuit Breaker CLOSED');
});

productBreaker.on('fallback', (result) => {
    console.log('Fallback executed, returning:', result);
});

// Usage in service
async function getProductDetails(productId) {
    try {
        const product = await productBreaker.fire(productId);
        return product;
    } catch (error) {
        // Handle complete failure
        console.error('Product service completely unavailable:', error);
        throw new Error('Product information temporarily unavailable');
    }
}

// Cache implementation for fallback
const productCache = new Map();

async function getCachedProduct(productId) {
    const cached = productCache.get(productId);
    if (cached) {
        return { ...cached, cached: true };
    }
    throw new Error('No cached data available');
}

Circuit Breaker Metrics: Monitor circuit breaker statistics (failure rate, open/close events, fallback executions) to identify service health issues and optimize timeout/threshold settings.

Retry Strategies

Automatically retry failed requests with intelligent backoff strategies to handle transient failures.

Exponential Backoff with Jitter

// Retry implementation with exponential backoff
const axios = require('axios');

async function retryWithBackoff(fn, maxRetries = 3, initialDelay = 1000) {
    let lastError;

    for (let attempt = 0; attempt <= maxRetries; attempt++) {
        try {
            return await fn();
        } catch (error) {
            lastError = error;

            // Don't retry client errors (4xx), except 408 and 429, which are usually transient
            const status = error.response?.status;
            if (status >= 400 && status < 500 && status !== 408 && status !== 429) {
                throw error;
            }

            // Last attempt, throw error
            if (attempt === maxRetries) {
                console.error(`All ${maxRetries + 1} attempts failed`);
                throw lastError;
            }

            // Calculate delay with exponential backoff and jitter
            const exponentialDelay = initialDelay * Math.pow(2, attempt);
            const jitter = Math.random() * 1000; // Random 0-1000ms
            const delay = Math.round(exponentialDelay + jitter); // whole milliseconds for readable logs

            console.log(`Attempt ${attempt + 1} failed, retrying in ${delay}ms...`);
            await sleep(delay);
        }
    }

    throw lastError;
}

function sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
}

// Usage
async function fetchUserOrders(userId) {
    return await retryWithBackoff(async () => {
        const response = await axios.get(
            `http://order-service:3001/orders?userId=${userId}`,
            {
                timeout: 5000,
                headers: { 'X-API-Key': process.env.ORDER_SERVICE_API_KEY }
            }
        );
        return response.data;
    }, 3, 1000); // Max 3 retries, starting with 1 second delay
}

Jitter Importance: Adding random jitter prevents the "thundering herd" problem where multiple clients retry simultaneously, potentially overwhelming the recovering service.
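The implementation above adds up to one second of jitter on top of a fixed exponential delay. A common variant, often called "full jitter", draws the entire delay at random between zero and a capped exponential ceiling. The sketch below shows that calculation; `fullJitterDelay` and its `base`, `cap`, and `random` options are hypothetical names for illustration.

```javascript
// Full-jitter backoff: a random delay in [0, min(cap, base * 2^attempt))
// `random` is injectable so the function can be tested deterministically
function fullJitterDelay(attempt, { base = 1000, cap = 30000, random = Math.random } = {}) {
    const ceiling = Math.min(cap, base * 2 ** attempt);
    return Math.floor(random() * ceiling);
}

// Upper bounds grow 1000, 2000, 4000, ... until the cap takes over
for (let attempt = 0; attempt < 6; attempt++) {
    const upperBound = Math.min(30000, 1000 * 2 ** attempt);
    console.log(`attempt ${attempt}: delay ${fullJitterDelay(attempt)}ms (max ${upperBound}ms)`);
}
```

Spreading retries over the whole interval trades a slightly shorter average wait for much less synchronization between clients, and the cap keeps late attempts from waiting indefinitely.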

Advanced Retry with axios-retry

const axios = require('axios');
// With axios-retry v4 and later, CommonJS code needs require('axios-retry').default
const axiosRetry = require('axios-retry');

// Configure axios with retry logic
axiosRetry(axios, {
    retries: 3,
    retryDelay: axiosRetry.exponentialDelay, // Built-in exponential backoff
    retryCondition: (error) => {
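        // Retry on network errors or 5xx responses.
        // The 5xx clause also applies to POST and other non-idempotent requests,
        // so keep it only if the target endpoints are safe to repeat.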
        return axiosRetry.isNetworkOrIdempotentRequestError(error)
            || (error.response?.status >= 500 && error.response?.status < 600);
    },
    onRetry: (retryCount, error, requestConfig) => {
        console.log(`Retry attempt ${retryCount} for ${requestConfig.url}`);
    }
});

// Requests made through this axios instance now retry whenever retryCondition matches
async function getProduct(productId) {
    const response = await axios.get(
        `http://product-service:3001/products/${productId}`,
        {
            timeout: 3000,
            headers: { 'X-API-Key': process.env.PRODUCT_SERVICE_API_KEY }
        }
    );
    return response.data;
}

Timeout Management

Proper timeout configuration prevents indefinite waiting and cascading delays across services.

// Multi-level timeout strategy
const axios = require('axios');

// Create axios instance with default timeouts
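const serviceClient = axios.create({
    // Assumed variable name; relative URLs such as /users/123 resolve against this base
    baseURL: process.env.TARGET_SERVICE_URL,
    timeout: 5000, // Default 5 second timeout
    headers: {
        'X-API-Key': process.env.API_KEY,
        'X-Service-Name': 'user-service'
    }
});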

// Overall deadline for a call, enforced with AbortController
// (a global in Node.js 15+; older runtimes need the abort-controller package).
// The instance's 5 second per-request timeout still applies to each attempt.
async function fetchWithTotalTimeout(url, totalTimeout = 10000) {
    const controller = new AbortController();
    const timeoutId = setTimeout(() => controller.abort(), totalTimeout);

    try {
        const response = await serviceClient.get(url, {
            signal: controller.signal
        });
        return response.data;
    } catch (error) {
        // axios rejects aborted requests with a CanceledError (code ERR_CANCELED), not an AbortError
        if (axios.isCancel(error)) {
            throw new Error(`Request timeout after ${totalTimeout}ms`);
        }
        throw error;
    } finally {
        clearTimeout(timeoutId);
    }
}

// Different timeouts for different operations
async function getUserProfile(userId) {
    // Fast operation - 2 second timeout
    return await serviceClient.get(`/users/${userId}`, { timeout: 2000 });
}

async function generateReport(userId) {
    // Slow operation - 30 second timeout
    return await serviceClient.post('/reports/generate', { userId }, { timeout: 30000 });
}

Timeout Best Practices: Set timeouts shorter than your load balancer/API gateway timeouts. A good rule: Service timeout < Gateway timeout < Client timeout. Monitor P95/P99 response times to set appropriate timeouts.
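Both rules can be checked mechanically. The sketch below derives a candidate timeout from observed latencies and validates the ordering of a timeout chain. The function names, the sample latencies, and the 1.5 headroom multiplier are assumptions for illustration, not recommended values.

```javascript
// Nearest-rank percentile over a list of latency samples (milliseconds)
function percentile(samples, p) {
    if (samples.length === 0) throw new Error('No samples');
    const sorted = [...samples].sort((a, b) => a - b);
    const rank = Math.ceil((p * sorted.length) / 100);
    return sorted[Math.max(rank, 1) - 1];
}

// Suggest a timeout: observed P99 plus headroom (the multiplier is an arbitrary starting point)
function suggestTimeout(samples, headroom = 1.5) {
    return Math.ceil(percentile(samples, 99) * headroom);
}

// Checks the ordering rule: service timeout < gateway timeout < client timeout
function isTimeoutChainValid({ service, gateway, client }) {
    return service < gateway && gateway < client;
}

const latencies = [120, 95, 180, 110, 2400, 130, 105, 150, 90, 140];
console.log(percentile(latencies, 50)); // 120
console.log(suggestTimeout(latencies)); // 3600
console.log(isTimeoutChainValid({ service: 3000, gateway: 5000, client: 8000 })); // true
```

The single 2400 ms outlier dominates the P99 here, which is why tail percentiles, not averages, are the useful input when picking a timeout.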

gRPC for High-Performance Communication

gRPC is a high-performance RPC framework using HTTP/2 and Protocol Buffers. Ideal for service-to-service communication.

Why gRPC?

- Performance: Binary protocol, HTTP/2 multiplexing, smaller payloads
- Type Safety: Strongly-typed contracts with Protocol Buffers
- Streaming: Server, client, and bidirectional streaming support
- Code Generation: Auto-generate client/server code in multiple languages

// user.proto - Protocol Buffer definition
syntax = "proto3";

package user;

service UserService {
    rpc GetUser (GetUserRequest) returns (User);
    rpc ListUsers (ListUsersRequest) returns (stream User);
    rpc UpdateUser (UpdateUserRequest) returns (User);
}

message GetUserRequest {
    int32 id = 1;
}

message ListUsersRequest {
    int32 page = 1;
    int32 page_size = 2;
}

message UpdateUserRequest {
    int32 id = 1;
    string name = 2;
    string email = 3;
}

message User {
    int32 id = 1;
    string name = 2;
    string email = 3;
    string created_at = 4;
}

// gRPC Server (Node.js)
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');

// Load proto file
const packageDefinition = protoLoader.loadSync('user.proto', {
    keepCase: true,
    longs: String,
    enums: String,
    defaults: true,
    oneofs: true
});

const userProto = grpc.loadPackageDefinition(packageDefinition).user;

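// Implement service methods (db stands for the application's data-access layer, not shown)
const getUser = async (call, callback) => {
    const userId = call.request.id;

    try {
        const user = await db.users.findById(userId);
        if (!user) {
            return callback({ code: grpc.status.NOT_FOUND, message: `User ${userId} not found` });
        }
        callback(null, user);
    } catch (error) {
        // A failed lookup is a server-side error, not a missing user
        callback({ code: grpc.status.INTERNAL, message: `Failed to load user ${userId}` });
    }
};

const listUsers = async (call) => {
    const { page, page_size } = call.request;

    try {
        // Stream users to client, one message per user
        const users = await db.users.findAll({ page, page_size });
        users.forEach(user => call.write(user));
        call.end();
    } catch (error) {
        // Emitting 'error' ends the stream with the given status
        call.emit('error', { code: grpc.status.INTERNAL, message: 'Failed to list users' });
    }
};

// updateUser is registered with the server below, so it needs an implementation too
// (db.users.update is an assumed data-layer method)
const updateUser = async (call, callback) => {
    const { id, name, email } = call.request;

    try {
        const user = await db.users.update(id, { name, email });
        callback(null, user);
    } catch (error) {
        callback({ code: grpc.status.INTERNAL, message: `Failed to update user ${id}` });
    }
};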

// Create and start server
const server = new grpc.Server();

server.addService(userProto.UserService.service, {
    getUser,
    listUsers,
    updateUser
});

server.bindAsync(
    '0.0.0.0:50051',
    grpc.ServerCredentials.createInsecure(),
    (err, port) => {
        if (err) {
            console.error('Server error:', err);
            return;
        }
        console.log(`gRPC server running on port ${port}`);
        // Needed on @grpc/grpc-js before 1.10; later versions serve once bound and deprecate this call
        server.start();
    }
);

// gRPC Client (Node.js)
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');

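// Use the same loader options as the server; without keepCase, page_size becomes pageSize
const packageDefinition = protoLoader.loadSync('user.proto', {
    keepCase: true,
    longs: String,
    enums: String,
    defaults: true,
    oneofs: true
});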
const userProto = grpc.loadPackageDefinition(packageDefinition).user;

// Create client
const client = new userProto.UserService(
    'user-service:50051',
    grpc.credentials.createInsecure()
);

// Call getUser method
client.getUser({ id: 123 }, (error, user) => {
    if (error) {
        console.error('gRPC error:', error);
        return;
    }
    console.log('User:', user);
});

// Stream users
const call = client.listUsers({ page: 1, page_size: 10 });

call.on('data', (user) => {
    console.log('Received user:', user);
});

call.on('end', () => {
    console.log('Stream ended');
});

call.on('error', (error) => {
    console.error('Stream error:', error);
});

gRPC Best Practices: Use gRPC for internal service-to-service communication where you control both client and server. Use REST for public APIs and third-party integrations. Consider gRPC-Web for browser clients.

Combining Patterns: Resilient Service Communication

// Service client combining API key headers, timeouts, retries, and a circuit breaker
const axios = require('axios');
const CircuitBreaker = require('opossum');
const axiosRetry = require('axios-retry');

class ResilientServiceClient {
    constructor(serviceName, baseURL, options = {}) {
        this.serviceName = serviceName;
        this.baseURL = baseURL;

        // Create axios instance
        this.client = axios.create({
            baseURL,
            timeout: options.timeout || 5000,
            headers: {
                // 'order-service' -> ORDER_SERVICE_API_KEY (hyphens must become underscores)
                'X-API-Key': process.env[`${serviceName.toUpperCase().replace(/-/g, '_')}_API_KEY`],
                'X-Service-Name': process.env.SERVICE_NAME
            }
        });

        // Add retry logic
        axiosRetry(this.client, {
            retries: options.retries ?? 3, // ?? keeps an explicit 0 (no retries)
            retryDelay: axiosRetry.exponentialDelay,
            retryCondition: axiosRetry.isNetworkOrIdempotentRequestError
        });

        // Wrap in circuit breaker
        this.breaker = new CircuitBreaker(
            (config) => this.client.request(config),
            {
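                // Covers the whole call including retries; raise it if
                // (retries + 1) x request timeout plus backoff can exceed this value
                timeout: options.breakerTimeout || 10000,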
                errorThresholdPercentage: 50,
                resetTimeout: 30000,
                name: `${serviceName}Breaker`
            }
        );

        this.setupEventHandlers();
    }

    setupEventHandlers() {
        this.breaker.on('open', () => {
            console.error(`[${this.serviceName}] Circuit breaker OPEN`);
        });

        this.breaker.on('close', () => {
            console.log(`[${this.serviceName}] Circuit breaker CLOSED`);
        });
    }

    async get(url, config = {}) {
        return await this.breaker.fire({ method: 'GET', url, ...config });
    }

    async post(url, data, config = {}) {
        return await this.breaker.fire({ method: 'POST', url, data, ...config });
    }

    async put(url, data, config = {}) {
        return await this.breaker.fire({ method: 'PUT', url, data, ...config });
    }

    async delete(url, config = {}) {
        return await this.breaker.fire({ method: 'DELETE', url, ...config });
    }
}

// Usage
const orderService = new ResilientServiceClient(
    'order-service',
    'http://order-service:3001'
);

const productService = new ResilientServiceClient(
    'product-service',
    'http://product-service:3002'
);

// Make resilient calls
async function getUserOrders(userId) {
    try {
        const response = await orderService.get(`/orders?userId=${userId}`);
        return response.data;
    } catch (error) {
        console.error('Failed to fetch orders:', error.message);
        return [];
    }
}

Exercise: Build a resilient microservices communication system with the following requirements:

1. Create three services: User Service, Order Service, and Notification Service
2. Implement JWT-based service-to-service authentication
3. Add circuit breakers with a 50% error threshold and a 30-second reset timeout
4. Implement exponential backoff retry (max 3 retries) with jitter
5. Set appropriate timeouts: 3s for GET requests, 10s for POST/PUT requests
6. When Order Service is called, it should:
   - Fetch user details from User Service
   - Create the order
   - Notify the user via Notification Service (fire-and-forget; don't fail order creation if the notification fails)
7. Log all circuit breaker state changes and retry attempts

Implement the order creation endpoint with proper error handling and fallback mechanisms.