Microservices Architecture & Design

Distributed Data & Consistency

18 min Lesson 7 of 12

Distributed Data & Consistency

In a monolith every write lands in one database and you get ACID guarantees for free. In a microservices system each service owns its data (Lesson 4), so a single business operation — "place an order" — may need to write to three databases owned by three separate services. There is no distributed transaction manager to wrap them. This lesson explains the consistency model you must design for, and introduces the Saga pattern — the most practical tool for coordinating multi-service writes reliably.

Why Distributed Transactions Do Not Scale

The classic answer to cross-database writes is the Two-Phase Commit (2PC) protocol: a coordinator asks all participants to prepare, waits for their votes, then issues a global commit or rollback. It works, but it has serious costs in a microservices context:

  • Blocking locks: between the prepare and commit phases every participant holds database locks. Under high concurrency this becomes a throughput bottleneck.
  • Coordinator is a SPOF: if the coordinator crashes after prepare but before commit, participants are stuck in an uncertain state indefinitely.
  • Tight coupling: all participants must be reachable at the same moment — contrary to the independent-deployability goal of microservices.
  • Protocol support: many modern datastores (NoSQL, managed cloud DBs) do not implement XA / JTA at all.
CAP Theorem in one sentence: A distributed system can guarantee at most two of Consistency, Availability, and Partition-tolerance. In a real network, partition-tolerance is non-negotiable — so you choose between strong consistency (CP) and high availability (AP). Microservices systems almost always choose AP, which means you must design for eventual consistency.

Eventual Consistency

Eventual consistency is a liveness guarantee: if no new writes arrive, all replicas and services will eventually converge to the same value. It does not mean the system is always wrong or chaotic — it means there is a brief window (milliseconds to seconds in practice) during which different services may see different values.

Designing for eventual consistency requires you to think about three things:

  1. What is the worst-case stale window? (seconds, minutes?) and is that acceptable for this operation?
  2. What happens if a step fails partway through a multi-service write? How do you compensate?
  3. How does the UI handle a state that is temporarily inconsistent? (e.g. show "order pending" rather than a final status)

The Saga Pattern

A Saga is a sequence of local transactions, one per participating service. Each local transaction updates its own database and then publishes an event (or sends a command) to trigger the next step. If any step fails, the saga executes compensating transactions in reverse order to undo the already-completed steps.

Think of it as replacing one big ACID transaction with a chain of small ones, plus a defined rollback path for every step.

Two Saga Coordination Styles

There are two ways to organise the flow:

  • Choreography: each service listens for events and reacts autonomously. No central coordinator. Scales well but can be hard to visualise; you have to trace logs across services to understand what happened.
  • Orchestration: a dedicated saga orchestrator (a Spring Boot service or a Spring State Machine) explicitly tells each participant what to do next. Easier to reason about and monitor; the orchestrator is a modest coupling point but not a SPOF because it is stateless or stores state in its own DB.
Start with orchestration when your team is new to sagas. The explicit flow in one place makes debugging, testing, and onboarding significantly easier. Switch to choreography for very high-throughput pipelines where the orchestrator itself would be the bottleneck.

Order-Placement Saga — Choreography Example

Scenario: placing an order requires (1) reserving inventory, (2) charging payment, and (3) creating a shipment record. Each service publishes Spring application events (or Kafka topics in production).

// Order Service — publishes the first event @Service @RequiredArgsConstructor public class OrderService { private final OrderRepository orders; private final ApplicationEventPublisher events; @Transactional public Order place(PlaceOrderCommand cmd) { Order order = Order.create(cmd); orders.save(order); // publish AFTER the local DB write commits (TransactionalEventListener) events.publishEvent(new OrderPlacedEvent(order.getId(), cmd.items())); return order; } }
// Inventory Service — listens for OrderPlacedEvent @Service @RequiredArgsConstructor public class InventoryService { private final StockRepository stock; private final ApplicationEventPublisher events; @TransactionalEventListener(phase = TransactionPhase.AFTER_COMMIT) @Transactional(propagation = Propagation.REQUIRES_NEW) public void onOrderPlaced(OrderPlacedEvent evt) { boolean reserved = stock.reserve(evt.orderId(), evt.items()); if (reserved) { events.publishEvent(new InventoryReservedEvent(evt.orderId())); } else { // compensate — tell Order Service to cancel events.publishEvent(new InventoryReservationFailedEvent(evt.orderId())); } } }
// Order Service — compensating transaction @TransactionalEventListener(phase = TransactionPhase.AFTER_COMMIT) @Transactional(propagation = Propagation.REQUIRES_NEW) public void onInventoryFailed(InventoryReservationFailedEvent evt) { orders.findById(evt.orderId()).ifPresent(order -> { order.cancel("Inventory unavailable"); orders.save(order); }); }
Use @TransactionalEventListener(phase = AFTER_COMMIT), not @EventListener, when publishing saga events. If you publish inside the same transaction and that transaction later rolls back, the event fires but the DB write did not — leaving downstream services acting on phantom data. AFTER_COMMIT guarantees the local write is durable before the event leaves the process.

Idempotency — The Hidden Requirement

In a distributed system, messages can be delivered more than once (network retries, broker redelivery). Every saga step handler must be idempotent: processing the same event twice must produce the same outcome as processing it once.

// Idempotency guard using a processed-events table @Transactional public void onInventoryReserved(InventoryReservedEvent evt) { if (processedEvents.exists(evt.eventId())) { return; // already handled — skip } // ... do the work ... processedEvents.markProcessed(evt.eventId()); }

Outbox Pattern — Guaranteed Event Delivery

If a service writes to its DB and then publishes to a message broker in two separate operations, a crash between the two leaves the DB updated but the event never sent. The Transactional Outbox pattern solves this: write the event to an outbox table in the same local transaction as the business write, then have a separate relay process (e.g. Debezium CDC, or a Spring @Scheduled poller) forward outbox rows to the broker and delete them.

@Transactional public Order place(PlaceOrderCommand cmd) { Order order = Order.create(cmd); orders.save(order); // write event atomically with the order — same local transaction OutboxEvent outboxEvent = OutboxEvent.of( "OrderPlaced", objectMapper.writeValueAsString(new OrderPlacedPayload(order.getId())) ); outboxRepository.save(outboxEvent); return order; // relay picks up outbox rows and publishes to Kafka }

Consistency Levels in Practice

  • Read-your-writes: route a user's reads to the service that just processed their write (sticky routing or version tokens) so they never see stale state for their own actions.
  • Monotonic reads: once a client sees a value, subsequent reads must not return an older value. Achieved by version numbers or timestamps in responses.
  • Causal consistency: if event A caused event B, any service that sees B must already have seen A. The Saga pattern and event ordering in a Kafka partition provide this within a single order flow.

Summary

Distributed data management trades ACID guarantees for availability and independent deployability. Eventual consistency is the norm; your job is to make the eventual window short and the failure paths explicit. The Saga pattern — whether choreographed or orchestrated — replaces a distributed transaction with a chain of local transactions plus compensating actions. Layer in idempotency guards and the Outbox pattern to make event delivery reliable, and you have the foundations for a data layer that survives partial failures gracefully.