Service Discovery, Config & Gateway

The Need for Service Discovery

When you split a monolith into microservices you gain independent deployability, independent scaling, and team autonomy. You also gain a new class of problem that the monolith never had: how does Service A find Service B at runtime? In a monolith the answer is a local method call. In a distributed system the answer must account for the fact that every service can be restarted, scaled, moved to a different host, or replaced at any moment — often without any human intervention.

The Hard-Coded Address Trap

The most natural first instinct is to put a URL directly in a configuration file:

    # application.yml (naive approach)
    inventory:
      service:
        url: http://192.168.1.42:8081

Then, in the calling service:

    import org.springframework.beans.factory.annotation.Value;
    import org.springframework.stereotype.Service;
    import org.springframework.web.client.RestClient;

    @Service
    public class OrderService {

        private final RestClient restClient;

        // URL injected from application.yml
        @Value("${inventory.service.url}")
        private String inventoryUrl;

        public OrderService(RestClient.Builder builder) {
            this.restClient = builder.build();
        }

        public StockResponse checkStock(String sku) {
            return restClient.get()
                    .uri(inventoryUrl + "/api/stock/{sku}", sku)
                    .retrieve()
                    .body(StockResponse.class);
        }
    }

This works perfectly in a lab. It falls apart the moment you operate in anything resembling a real environment.

Why Hard-Coded Addresses Break in the Cloud

Cloud and container platforms are designed around ephemerality: instances come and go, IP addresses are assigned dynamically, and no single host is considered permanent. Hard-coded addresses violate every one of these assumptions.

Dynamic IP Assignment

When a Docker container is recreated or a Kubernetes pod is replaced or rescheduled, the platform typically assigns it a new IP address from a pool. The address 192.168.1.42 that was valid this morning may belong to a completely different service, or to no service at all, by afternoon. Your hard-coded URL now points at nothing or, worse, at the wrong thing.

Horizontal Scaling

Suppose traffic spikes and your platform auto-scales the Inventory service from one instance to three. You now have three valid addresses, one per instance, but your configuration file still mentions only the original one. The extra capacity is completely invisible to your callers. You pay for three instances and use one.

    # You have three instances running:
    #   http://10.0.0.11:8081   (original, hard-coded)
    #   http://10.0.0.12:8081   (new, invisible to Order service)
    #   http://10.0.0.13:8081   (new, invisible to Order service)
    #
    # 100% of traffic still goes to 10.0.0.11.
    # The other two instances receive zero requests.

Rolling Deployments and Zero-Downtime Updates

During a rolling deployment, old and new instances coexist temporarily. If a load balancer or the caller itself tracks only one address, requests pile up against a single instance during the transition. Other instances — some running the old version, some the new — are again invisible.

Multi-Environment Configuration Drift

With hard-coded addresses, every environment (dev, staging, production) needs its own version of every property file. Developers frequently forget to update one environment, deploying a staging build that still calls a production address — or vice versa. The blast radius of that mistake is wide.

Security implication of hard-coded addresses: Embedding a service's internal address in a configuration file that gets committed to source control exposes your internal network topology. An attacker who reads your repository now knows the IP, port, and path structure of every internal service. Use environment variables or a secrets manager for any value that changes between environments, and use a service registry to avoid storing addresses at all.
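One way to keep environment-specific addresses out of committed files is to resolve them from the environment at startup and fail fast when they are missing. The sketch below is plain Java with no framework; the class name EnvConfig and the variable name INVENTORY_SERVICE_URL are hypothetical and chosen only for illustration.

```java
import java.net.URI;
import java.util.Map;

// Hypothetical helper: resolves a service base URL from environment variables.
final class EnvConfig {
    private final Map<String, String> env;

    // The map is injected so the lookup can be exercised without touching the real environment.
    EnvConfig(Map<String, String> env) {
        this.env = env;
    }

    // Production code would typically build the helper from the process environment.
    static EnvConfig fromSystem() {
        return new EnvConfig(System.getenv());
    }

    // Returns the configured URI, or throws so a misconfigured deployment fails at startup
    // instead of silently calling the wrong environment.
    URI requiredUri(String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing required environment variable: " + name);
        }
        return URI.create(value.trim());
    }
}
```

A caller would use something like EnvConfig.fromSystem().requiredUri("INVENTORY_SERVICE_URL"). This only moves the address out of source control; it does not make the address dynamic, which is the part a registry addresses.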

What Service Discovery Solves

Service discovery replaces hard-coded addresses with a registry — a shared, always-current directory of running service instances. Instead of asking "what is the fixed address of Inventory?", a caller asks "give me the address of any healthy Inventory instance right now."

The core contract has two sides:

  • Registration: when a service starts, it publishes its own address, port, and health-check URL to the registry. When it shuts down cleanly, it de-registers. When it crashes, the registry evicts it after a configurable timeout.
  • Discovery (lookup): when a caller needs to reach a service, it queries the registry by logical name (e.g., inventory-service) and receives back a list of healthy addresses. It can then pick one (round-robin, random, or by latency) without knowing anything about the underlying infrastructure.

The key shift in thinking: you stop reasoning about where a service lives (an IP address, a hostname) and start reasoning about what it is (a logical name). The registry owns the where; your code owns the what.
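The two sides of this contract can be sketched as a small in-memory registry. This is an illustration under simplifying assumptions, not how Eureka is implemented: the names Instance and SimpleRegistry are hypothetical, health is approximated by heartbeat timestamps instead of a health-check URL, and time is passed in explicitly so the eviction rule is easy to follow.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// One running instance of a service, identified by host and port.
record Instance(String host, int port) {}

// Minimal in-memory registry: logical name -> instances with their last heartbeat time.
final class SimpleRegistry {
    private final long leaseMillis;
    private final Map<String, Map<Instance, Long>> leases = new ConcurrentHashMap<>();

    SimpleRegistry(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    // Registration and heartbeat renewal are the same operation: record "seen at nowMillis".
    void heartbeat(String serviceName, Instance instance, long nowMillis) {
        leases.computeIfAbsent(serviceName, k -> new ConcurrentHashMap<>())
              .put(instance, nowMillis);
    }

    // Clean shutdown: the instance removes itself immediately.
    void deregister(String serviceName, Instance instance) {
        Map<Instance, Long> instances = leases.get(serviceName);
        if (instances != null) {
            instances.remove(instance);
        }
    }

    // Lookup by logical name. Instances whose lease has expired (for example after a crash)
    // are evicted before the list is returned.
    List<Instance> lookup(String serviceName, long nowMillis) {
        Map<Instance, Long> instances = leases.get(serviceName);
        if (instances == null) {
            return List.of();
        }
        instances.values().removeIf(lastSeen -> nowMillis - lastSeen > leaseMillis);
        return List.copyOf(instances.keySet());
    }
}
```

With a 30-second lease, an instance that stops sending heartbeats disappears from lookup results once more than 30 seconds have passed since its last heartbeat, while a cleanly de-registered instance disappears at once.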

Client-Side vs. Server-Side Discovery

There are two broad patterns for using a registry, and it matters which one your framework implements:

  • Server-side discovery: the caller sends a request to a load balancer or gateway, which queries the registry and forwards the request. The caller does not know about the registry at all. AWS Application Load Balancer and Kubernetes Services work this way.
  • Client-side discovery: the caller itself queries the registry, selects an instance, and makes the HTTP call directly. Spring Cloud LoadBalancer (the replacement for Ribbon) implements this pattern. It is more flexible but puts registry awareness inside every service.

Spring Cloud Gateway (covered later in this tutorial) often blends both: the gateway acts as a server-side entry point for external traffic but uses client-side discovery internally to route to backend services.
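In the client-side pattern, the instance-selection step runs inside the caller. A minimal round-robin sketch of that step follows; the class RoundRobinChooser is hypothetical and is not the Spring Cloud LoadBalancer implementation, and the address strings are taken from the scaling example earlier in this lesson.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical client-side chooser: cycles through the addresses returned by a registry lookup.
final class RoundRobinChooser {
    private final AtomicInteger position = new AtomicInteger();

    // Returns the next address in rotation. floorMod keeps the index non-negative
    // even after the counter overflows.
    String choose(List<String> healthyAddresses) {
        if (healthyAddresses.isEmpty()) {
            throw new IllegalStateException("No healthy instances available");
        }
        int index = Math.floorMod(position.getAndIncrement(), healthyAddresses.size());
        return healthyAddresses.get(index);
    }
}
```

In the server-side pattern the same logic exists, but it lives in the load balancer or gateway, so the caller never sees the list of addresses.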

The Distributed Systems Trade-Off

A service registry is itself a distributed component, which means it must be highly available. If the registry goes down, services that rely on it for routing decisions can no longer discover new peers. Spring Cloud Netflix Eureka addresses this with a local cache: each client caches the last-known registry snapshot and continues routing from that cache even when the registry server is temporarily unreachable. This is a deliberate trade-off between consistency (always seeing the latest list) and availability (being able to route at all): Eureka favors availability, which places it on the AP side of the CAP theorem.
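The caching behavior can be sketched in a few lines. This is a simplified model, not Eureka's client: the class CachingRegistryClient is hypothetical, the registry call is represented by a Supplier that throws when the server is unreachable, and the refresh happens on every call, whereas a real client would more likely refresh on a background schedule.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical client-side cache: keeps the last successful registry snapshot.
final class CachingRegistryClient {
    // Assumed to throw a RuntimeException when the registry cannot be reached.
    private final Supplier<List<String>> registryFetch;
    private volatile List<String> lastKnown = List.of();

    CachingRegistryClient(Supplier<List<String>> registryFetch) {
        this.registryFetch = registryFetch;
    }

    // Tries to refresh from the registry; on failure, serves the possibly stale cached snapshot.
    List<String> instances() {
        try {
            lastKnown = List.copyOf(registryFetch.get());
        } catch (RuntimeException registryUnavailable) {
            // Availability over consistency: keep routing from the old snapshot.
        }
        return lastKnown;
    }
}
```

The cost of this choice is visible in the catch block: while the registry is down, the caller keeps using whatever list it saw last, including instances that may have died since.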

Design for stale registry data: because clients use a cached registry, a freshly crashed instance may remain in a caller's local list for several seconds or even minutes, until the registry evicts it and the client refreshes its cache. Always pair service discovery with a circuit breaker (covered in the Resilience tutorial) so that repeated failures to a stale address do not cascade into broader outages. Discovery solves the "find it" problem; resilience patterns solve the "what if it is down" problem.
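To make the pairing concrete, here is a deliberately small count-based circuit breaker for a single downstream address. It is a teaching sketch with hypothetical names and thresholds, not the approach of any particular library; production code would normally rely on an established resilience library instead of a hand-rolled class.

```java
// Minimal count-based circuit breaker for one downstream address (illustrative only).
final class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private int consecutiveFailures = 0;
    private long openedAt = -1; // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    // Calls are allowed while the circuit is closed, or once the open interval
    // has elapsed (which permits a trial call).
    synchronized boolean allowRequest(long nowMillis) {
        return openedAt < 0 || nowMillis - openedAt >= openMillis;
    }

    // A success closes the circuit and resets the failure count.
    synchronized void recordSuccess() {
        consecutiveFailures = 0;
        openedAt = -1;
    }

    // Enough consecutive failures open (or re-open) the circuit and restart the wait interval.
    synchronized void recordFailure(long nowMillis) {
        consecutiveFailures++;
        if (consecutiveFailures >= failureThreshold) {
            openedAt = nowMillis;
        }
    }
}
```

A caller would check allowRequest before contacting an address taken from its cached list, and report the outcome afterwards. After three consecutive failures with a threshold of 3, further calls to that address are skipped until the open interval has passed, so a stale entry costs a few failed requests instead of a steady stream of them.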

A Concrete Before-and-After

Without service discovery — what a caller's code looks like:

    // Before: hard-coded, fragile
    String url = "http://192.168.1.42:8081/api/stock/" + sku;
    ResponseEntity<StockResponse> response =
            restTemplate.getForEntity(url, StockResponse.class);

With service discovery and Spring Cloud LoadBalancer — what the same call looks like:

    // After: logical name resolved at runtime via the registry
    @Bean
    @LoadBalanced // tells Spring to intercept and resolve the hostname
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }

    // In the service:
    String url = "http://inventory-service/api/stock/" + sku;
    ResponseEntity<StockResponse> response =
            restTemplate.getForEntity(url, StockResponse.class);
    // "inventory-service" is looked up in Eureka; a healthy instance address is substituted

The calling code never knows the real IP. It never changes when instances are added, removed, or replaced. The registry and the load-balancing layer handle all of that transparently.
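Conceptually, the substitution step amounts to rewriting the host part of the request URI once an instance has been chosen. The plain-Java sketch below illustrates only that idea; LogicalUriResolver is a hypothetical name and this is not Spring's interceptor code, and the host and port passed in are assumed to come from a registry lookup.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Conceptual stand-in for what a load-balancing interceptor does to a request URI.
final class LogicalUriResolver {

    // Replaces the logical service name in the host position with a concrete host and port,
    // leaving the scheme, path, query, and fragment as they were.
    static URI resolve(URI logical, String host, int port) {
        try {
            return new URI(logical.getScheme(), null, host, port,
                    logical.getPath(), logical.getQuery(), logical.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot rewrite " + logical, e);
        }
    }
}
```

For example, resolving http://inventory-service/api/stock/ABC-123 against the instance 10.0.0.12:8081 yields http://10.0.0.12:8081/api/stock/ABC-123, while the application code continues to see only the logical name.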

Summary

Hard-coded service addresses are a deceptively simple solution that breaks as soon as your infrastructure becomes dynamic — which is the default in any cloud or container environment. The core problems are dynamic IP assignment, invisible horizontal scaling, rolling-deployment complexity, and multi-environment drift. Service discovery solves these by introducing a registry that maps logical service names to live instance addresses, shifts infrastructure knowledge out of application code, and enables the load-balancing and resilience patterns the following lessons build on. In the next lesson you will implement this registry using Spring Cloud Netflix Eureka.