Metrics & Monitoring
Distributed tracing tells you where time was spent in a single request. Metrics tell you how healthy your service is right now — and over time. In production, you cannot attach a debugger or read every log line; you need dashboards that summarise thousands of requests per second into numbers you can act on. This lesson covers Micrometer, the metrics facade built into Spring Boot 3, and how it feeds data into Prometheus (the collection backend) and Grafana (the visualisation layer).
The Metrics Stack in One Sentence
Your Spring Boot service exposes a /actuator/prometheus endpoint; a Prometheus server scrapes that endpoint on a schedule; Grafana queries Prometheus and renders dashboards. Each component has exactly one job, and the decoupling means you can swap any layer independently.
Micrometer is a facade: you call counter.increment() once, and Micrometer translates it to the Prometheus format, CloudWatch, Datadog, InfluxDB, or any other registry you add to the classpath. Your application code never imports a Prometheus class directly.
Adding the Dependencies
Spring Boot Actuator ships Micrometer core. To export Prometheus-format metrics add one more dependency:
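A minimal sketch for a Maven build, assuming the Spring Boot parent POM or BOM manages the versions (Gradle users would add the same coordinates to their dependencies block):

```xml
<!-- Actuator brings in micrometer-core and the /actuator endpoints -->
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<!-- Prometheus registry: backs the /actuator/prometheus scrape endpoint -->
<dependency>
  <groupId>io.micrometer</groupId>
  <artifactId>micrometer-registry-prometheus</artifactId>
  <scope>runtime</scope>
</dependency>
```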
Then expose the endpoint in application.yml:
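A possible configuration, assuming a service called order-service (a hypothetical name) and that only the health and prometheus endpoints need to be reachable over HTTP:

```yaml
spring:
  application:
    name: order-service            # hypothetical service name

management:
  endpoints:
    web:
      exposure:
        include: health,prometheus # expose only what is needed
  metrics:
    tags:
      application: ${spring.application.name}  # common tag added to every meter
```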
The application tag propagates your service name to every metric automatically, which is essential when multiple services share one Prometheus instance.
Never expose /actuator/prometheus to the public internet. The endpoint reveals internal counters, thread pool sizes, database pool stats, and JVM memory breakdown, which is a gold mine for an attacker doing reconnaissance. Protect it with Spring Security, a dedicated management port (management.server.port: 9090) reachable only inside your cluster, or a network-level firewall rule. Prometheus should scrape from an internal network, not through your public load balancer.
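The dedicated management port option might look like the fragment below. This is a sketch only: the port number comes from the text above, and moving the endpoints to another port does not by itself restrict access, so the cluster network or firewall still has to keep port 9090 private.

```yaml
management:
  server:
    port: 9090                     # actuator endpoints leave the public application port
  endpoints:
    web:
      exposure:
        include: health,prometheus
```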
The Four Core Metric Types
Micrometer provides four fundamental instruments. Choosing the right one matters because Prometheus aggregates them differently:
- Counter: a value that only goes up. Use it for events: orders placed, errors thrown, cache misses. Never reset it; Prometheus computes rates with rate().
- Gauge: a value that goes up and down. Use it for current state: active connections, queue depth, memory used. Sampled at scrape time, not accumulated.
- Timer: measures duration and counts invocations simultaneously. It always records a count, a total time, and a maximum, and can optionally publish percentiles or histogram buckets. The most important instrument for latency SLOs.
- DistributionSummary: like Timer but for non-time values: bytes transferred, items in a batch, request payload size.
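The advice never to reset a counter follows from how a rate is derived from cumulative samples. The plain-Java sketch below is a simplification for illustration, not Prometheus's actual implementation (which also extrapolates to the window boundaries). The class and record names are hypothetical. It shows that a drop in value is interpreted as a process restart, so a manual reset would be indistinguishable from one.

```java
import java.util.List;

/** Simplified sketch of a rate()-style calculation over counter samples. */
public class CounterRateSketch {

    /** One scraped sample: timestamp in seconds and the counter's cumulative value. */
    public record Sample(double timestampSeconds, double value) {}

    /**
     * Returns the average per-second increase across the samples.
     * A drop in value is treated as a counter reset: the counter is
     * assumed to have started again from zero.
     */
    public static double ratePerSecond(List<Sample> samples) {
        if (samples.size() < 2) {
            return Double.NaN; // not enough data to compute a rate
        }
        double elapsed = samples.get(samples.size() - 1).timestampSeconds()
                - samples.get(0).timestampSeconds();
        if (elapsed <= 0) {
            return Double.NaN; // samples must span a positive time range
        }
        double increase = 0.0;
        for (int i = 1; i < samples.size(); i++) {
            double prev = samples.get(i - 1).value();
            double curr = samples.get(i).value();
            // After a reset, the whole current value counts as new increase.
            increase += (curr >= prev) ? (curr - prev) : curr;
        }
        return increase / elapsed;
    }

    public static void main(String[] args) {
        List<Sample> samples = List.of(
                new Sample(0, 100),
                new Sample(15, 130),
                new Sample(30, 10),   // restart: the counter began again from zero
                new Sample(45, 40));
        System.out.println(ratePerSecond(samples));
    }
}
```

Because the raw value is cumulative, the scraper can miss a scrape without losing events; the next sample still carries the full total.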
Recording Metrics in Your Service
Inject MeterRegistry (auto-configured by Spring Boot) into any Spring bean:
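A minimal sketch of such a bean, assuming micrometer-core and Spring are on the classpath. The OrderService class and its createOrder method are hypothetical; the meter names orders.created and orders.processing.time are chosen to match the PromQL queries later in this lesson.

```java
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.springframework.stereotype.Service;

@Service
public class OrderService {

    private final Counter ordersCreated;
    private final Timer processingTimer;

    // Spring injects the auto-configured MeterRegistry through the constructor.
    public OrderService(MeterRegistry registry) {
        this.ordersCreated = Counter.builder("orders.created")
                .description("Number of orders created")
                .register(registry);
        this.processingTimer = Timer.builder("orders.processing.time")
                .description("Time spent processing an order")
                .publishPercentiles(0.5, 0.95, 0.99) // client-side percentiles
                .register(registry);
    }

    /** Hypothetical business method: times the work and counts the event. */
    public String createOrder(String customerId) {
        return processingTimer.record(() -> {
            String orderId = "order-" + customerId; // placeholder for real order logic
            ordersCreated.increment();
            return orderId;
        });
    }
}
```

Registering the meters once in the constructor and keeping them in fields avoids a registry lookup on every call.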
The publishPercentiles option computes percentiles inside the application (the state is held in the app's memory), and values computed this way cannot be meaningfully combined across instances. To aggregate across multiple instances, use publishPercentileHistogram(true) instead, which exports histogram buckets, and compute percentiles in Prometheus with histogram_quantile().
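To see why buckets aggregate well while precomputed percentiles do not, it helps to look at how a quantile is estimated from cumulative bucket counts. The plain-Java sketch below is a simplified version of the linear interpolation that histogram_quantile() is documented to perform. It assumes classic cumulative buckets sorted by upper bound, a lowest bucket that starts at zero, and a final +Inf bucket; the class name is hypothetical.

```java
/** Simplified sketch of quantile estimation from cumulative histogram buckets. */
public class HistogramQuantileSketch {

    /**
     * @param q                quantile in the range (0, 1]
     * @param upperBounds      bucket upper bounds ("le"), ascending, last one may be +Infinity
     * @param cumulativeCounts observations less than or equal to each upper bound
     */
    public static double quantile(double q, double[] upperBounds, double[] cumulativeCounts) {
        if (q <= 0 || q > 1 || upperBounds.length == 0
                || upperBounds.length != cumulativeCounts.length) {
            return Double.NaN;
        }
        double total = cumulativeCounts[cumulativeCounts.length - 1];
        if (total <= 0) {
            return Double.NaN; // no observations recorded
        }
        double rank = q * total; // position of the requested observation
        for (int i = 0; i < upperBounds.length; i++) {
            if (cumulativeCounts[i] >= rank) {
                if (Double.isInfinite(upperBounds[i])) {
                    // Rank falls in the +Inf bucket: report the highest finite bound.
                    return (i == 0) ? Double.NaN : upperBounds[i - 1];
                }
                double lower = (i == 0) ? 0.0 : upperBounds[i - 1];
                double countBelow = (i == 0) ? 0.0 : cumulativeCounts[i - 1];
                double inBucket = cumulativeCounts[i] - countBelow;
                // Assume observations are spread evenly inside the bucket.
                return lower + (upperBounds[i] - lower) * (rank - countBelow) / inBucket;
            }
        }
        return Double.NaN;
    }

    public static void main(String[] args) {
        double[] bounds = {0.1, 0.5, 1.0, Double.POSITIVE_INFINITY};
        double[] counts = {50, 90, 99, 100};
        System.out.println(quantile(0.95, bounds, counts));
    }
}
```

Bucket counts from several instances can simply be summed per upper bound before this calculation runs, which is what makes the histogram approach aggregatable.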
The @Timed Annotation — Declarative Timing
For HTTP handler methods, annotating with @Timed is cleaner than wrapping every method body:
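A sketch of an annotated handler, assuming Spring Web and micrometer-core are on the classpath and that the aspect which processes @Timed is active. The controller, its request record, and the /orders path are hypothetical; the timer name http.orders.create is the one this lesson refers to.

```java
import io.micrometer.core.annotation.Timed;
import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.ResponseStatus;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/orders")
public class OrderController {

    /** Hypothetical request payload, defined here only to keep the sketch self-contained. */
    public record CreateOrderRequest(String customerId) {}

    @PostMapping
    @ResponseStatus(HttpStatus.CREATED)
    @Timed(value = "http.orders.create", description = "Time taken to create an order")
    public String create(@RequestBody CreateOrderRequest request) {
        // Placeholder for the real order creation logic.
        return "order-" + request.customerId();
    }
}
```

The method body contains no metrics code at all; the timing is applied around the call by the aspect.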
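Micrometer's TimedAspect is what makes the annotation work on Spring bean methods. Spring Boot 3.2 and later register it when Spring AOP (AspectJ) is on the classpath and management.observations.annotations.enabled is set to true; on earlier versions, declare a TimedAspect bean yourself. Micrometer's Prometheus naming convention replaces dots with underscores and appends the base unit, so a timer named http.orders.create is exported as http_orders_create_seconds (with _count, _sum, and _max series).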
Spring Boot's built-in http.server.requests timer records every inbound request with tags for method, uri, status, and outcome. Before adding custom metrics, check whether the built-in ones already answer your question.
Gauges for Live State
Gauges are best for values that fluctuate. A common pattern is tracking a queue's current size:
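A minimal sketch of that pattern, assuming micrometer-core and Spring are on the classpath. The OrderQueueMetrics class and the meter name orders.queue.size are hypothetical.

```java
import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import java.util.Queue;
import java.util.concurrent.LinkedBlockingQueue;
import org.springframework.stereotype.Component;

@Component
public class OrderQueueMetrics {

    // Kept as a field: Micrometer holds only a weak reference to the observed
    // object, so the bean itself must keep the queue alive.
    private final Queue<String> pendingOrders = new LinkedBlockingQueue<>();

    public OrderQueueMetrics(MeterRegistry registry) {
        // The gauge stores the queue and a function; it reads the size on demand.
        Gauge.builder("orders.queue.size", pendingOrders, Queue::size)
                .description("Orders waiting to be processed")
                .register(registry);
    }

    public void enqueue(String orderId) {
        pendingOrders.add(orderId);
    }

    public String poll() {
        return pendingOrders.poll();
    }
}
```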
Notice the gauge takes a reference to the queue and a function to read its size. Micrometer calls the function at scrape time — no manual update needed.
How Prometheus Scrapes the Endpoint
The /actuator/prometheus response is plain text in the Prometheus exposition format, or in the closely related OpenMetrics format when the client requests it through the Accept header. Prometheus configuration to scrape it:
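A minimal prometheus.yml sketch, assuming the application runs on localhost:8080 and using a hypothetical job name; adjust the target and interval to your environment:

```yaml
scrape_configs:
  - job_name: order-service            # hypothetical job name
    metrics_path: /actuator/prometheus # Spring Boot path (the Prometheus default is /metrics)
    scrape_interval: 15s
    static_configs:
      - targets: ["localhost:8080"]    # assumed host:port of the Spring Boot app
```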
In Kubernetes you would use service discovery annotations instead of static_configs, but the principle is the same.
Useful PromQL Queries
Once Prometheus is collecting data, these queries form the backbone of most dashboards:
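- rate(orders_created_total[1m]): orders per second over the last minute.
- histogram_quantile(0.95, rate(orders_processing_time_seconds_bucket[5m])): 95th-percentile latency over 5 minutes. This requires the timer to publish histogram buckets.
- rate(http_server_requests_seconds_count{status=~"5.."}[1m]): 5xx error rate per second. The status tag holds the numeric code (500, 503, and so on), so a regex match is needed rather than the literal "5xx".
- sum(jvm_memory_used_bytes{area="heap"}) / sum(jvm_memory_max_bytes{area="heap"}): heap utilisation ratio, summed across the heap memory pools.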
The Four Golden Signals in Grafana
Site Reliability Engineering defines four golden signals that every service dashboard should show: Latency (p50/p95/p99 of request duration), Traffic (requests per second), Errors (error rate as a percentage of total traffic), and Saturation (CPU, heap, thread-pool fullness). Grafana lets you compose PromQL expressions into panels and set alert thresholds. A common starting point is importing the community Spring Boot Statistics dashboard (Grafana ID 6756), which covers JVM, HikariCP pool, and HTTP server metrics out of the box.
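As a starting point, the four signals might translate into panel queries like the ones below. They are sketches built on the built-in http.server.requests timer and JVM meters. The latency query assumes histogram buckets are enabled for that timer (for example with management.metrics.distribution.percentiles-histogram.http.server.requests set to true), and the windows are arbitrary choices.

```promql
# Latency: p95 request duration, aggregated across instances
histogram_quantile(0.95, sum by (le) (rate(http_server_requests_seconds_bucket[5m])))

# Traffic: requests per second
sum(rate(http_server_requests_seconds_count[1m]))

# Errors: share of requests that ended in a 5xx status, as a percentage
100 * sum(rate(http_server_requests_seconds_count{status=~"5.."}[5m]))
    / sum(rate(http_server_requests_seconds_count[5m]))

# Saturation: heap utilisation ratio
sum(jvm_memory_used_bytes{area="heap"}) / sum(jvm_memory_max_bytes{area="heap"})
```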
Summary
Micrometer provides a thin, vendor-neutral API over all metric types. Adding micrometer-registry-prometheus exposes a scrape endpoint; Prometheus pulls it on a schedule and Grafana visualises the data. Record counters for events, timers for latency, and gauges for live state. Secure the actuator endpoint — never expose it publicly. Build dashboards around the four golden signals. Combined with the distributed tracing from the previous lesson, you now have the full observability stack a production microservice requires.