JVM Internals & Performance

GC Algorithms & Tuning

Understanding which garbage collector is running and why it behaves the way it does is the difference between guessing at JVM flags and making deliberate, measurable decisions. This lesson focuses on two collectors you will encounter in modern production services — G1GC and ZGC — and the baseline heap-sizing flags that govern their behaviour.

Why Different GC Algorithms Exist

Every collector is an engineering trade-off across three axes:

Throughput: total application work done per unit of time. Both GC pauses and CPU time spent on concurrent GC threads subtract from this.
Latency: the worst-case pause a single user request can experience.
Footprint: how much memory the collector needs beyond the live data, for its own data structures and for the spare heap it needs to work in.

No collector wins on all three simultaneously. Choosing the right one starts with knowing your SLO: is a 99th-percentile latency target more important than raw throughput?
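To make the throughput and latency axes concrete, here is a minimal sketch that derives both from a list of observed pause durations. The class name GcTradeoffs and the sample pauses are invented for illustration, and the throughput figure only accounts for stop-the-world time, not CPU consumed by concurrent GC threads.

```java
import java.util.Arrays;

final class GcTradeoffs {
    /** Fraction of wall-clock time NOT spent in stop-the-world pauses. */
    static double throughput(double[] pausesMs, double elapsedMs) {
        double pausedMs = Arrays.stream(pausesMs).sum();
        return 1.0 - pausedMs / elapsedMs;
    }

    /** Nearest-rank percentile (p in (0, 100]) of the observed pauses. */
    static double percentile(double[] pausesMs, double p) {
        double[] sorted = pausesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Hypothetical pauses (ms) observed during one minute of wall-clock time.
        double[] pausesMs = {12, 9, 15, 11, 180, 10, 13, 8, 14, 12};
        System.out.printf("throughput = %.2f%%%n", 100 * throughput(pausesMs, 60_000));
        System.out.printf("p50 pause  = %.0f ms%n", percentile(pausesMs, 50));
        System.out.printf("p99 pause  = %.0f ms%n", percentile(pausesMs, 99));
    }
}
```

With this sample, throughput stays above 99% and the median pause is 12 ms, yet the 99th-percentile pause is 180 ms. A single outlier barely moves throughput but can break a latency SLO, which is why the two axes have to be judged separately.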

Garbage-First (G1GC)

G1 has been the default collector since JDK 9. Its defining idea is that it divides the entire heap into a large number of equal-sized regions (typically 1 MB–32 MB each, chosen automatically). Regions are not permanently assigned to young or old generations; the collector dynamically reclassifies them.
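As a rough illustration of how the region size could follow from the heap size, the sketch below assumes G1 aims for about 2048 regions, rounds down to a power of two, and clamps the result to the 1 MB to 32 MB range. This is a simplified model, not HotSpot's exact code; the rounding details differ between JDK releases, and the class and constant names are made up for this example.

```java
final class G1RegionSizeSketch {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION = 1 * MB;
    static final long MAX_ERGONOMIC_REGION = 32 * MB;
    // Assumption: G1 targets roughly this many regions when sizing them automatically.
    static final long TARGET_REGION_COUNT = 2048;

    /** Approximate automatically chosen region size for a given heap size. */
    static long regionSizeBytes(long heapBytes) {
        long raw = Math.max(heapBytes / TARGET_REGION_COUNT, MIN_REGION);
        long powerOfTwo = Long.highestOneBit(raw); // round down to a power of two
        return Math.min(powerOfTwo, MAX_ERGONOMIC_REGION);
    }

    public static void main(String[] args) {
        long[] heapsMb = {512, 4096, 6144, 65536};
        for (long heapMb : heapsMb) {
            System.out.printf("heap %6d MB -> region %2d MB%n",
                    heapMb, regionSizeBytes(heapMb * MB) / MB);
        }
    }
}
```

To see the value a real JVM picked, check G1HeapRegionSize in the output of java -XX:+PrintFlagsFinal -version rather than relying on this approximation.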

How G1 Works at a High Level

Young-only phase: short stop-the-world (STW) young collections evacuate live objects from young regions into survivor or old regions. When heap occupancy crosses the initiating threshold (-XX:InitiatingHeapOccupancyPercent), G1 starts concurrent marking, which runs alongside the application.
Mixed collection phase: after marking identifies which old regions are mostly garbage, G1 includes a subset of those old regions in subsequent collections ("mixed GC"). It collects the regions with the most garbage first, hence the name Garbage-First.
Full GC: a last-resort full compacting collection, single-threaded before JDK 10 and parallel from JDK 10 onward. It stops the application for the whole collection, so you want to avoid it in production.

G1's pause-time goal: G1 tries to keep pauses under a user-specified target (-XX:MaxGCPauseMillis, default 200 ms) by adjusting how many regions it collects per cycle. It does not guarantee the target — it is a hint, not a hard deadline.
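The interplay between "most garbage first" and the pause-time goal can be sketched as a toy selection loop. The cost model here (pause time proportional to live bytes copied) and every name in the class are assumptions made for illustration; HotSpot's real predictor is considerably more involved.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class GarbageFirstSketch {
    static final long MB = 1024L * 1024L;

    /** One old region as seen after concurrent marking. */
    static final class Region {
        final int id;
        final long liveBytes;
        final long capacityBytes;

        Region(int id, long liveBytes, long capacityBytes) {
            this.id = id;
            this.liveBytes = liveBytes;
            this.capacityBytes = capacityBytes;
        }

        long garbageBytes() {
            return capacityBytes - liveBytes;
        }
    }

    /**
     * Picks regions with the most garbage first and stops once the predicted
     * pause would exceed the budget. msPerLiveMb is a hypothetical copy cost.
     */
    static List<Region> chooseCollectionSet(List<Region> candidates,
                                            double budgetMs, double msPerLiveMb) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingLong(Region::garbageBytes).reversed());
        List<Region> chosen = new ArrayList<>();
        double predictedMs = 0;
        for (Region r : sorted) {
            double costMs = r.liveBytes / (double) MB * msPerLiveMb;
            if (predictedMs + costMs > budgetMs) {
                break; // adding this region would blow the pause budget
            }
            chosen.add(r);
            predictedMs += costMs;
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> old = List.of(
                new Region(0, 3 * MB, 4 * MB),
                new Region(1, MB / 2, 4 * MB),
                new Region(2, 1 * MB, 4 * MB),
                new Region(3, 2 * MB, 4 * MB));
        for (Region r : chooseCollectionSet(old, 4.0, 2.0)) {
            System.out.println("collect region " + r.id
                    + ", reclaiming " + r.garbageBytes() / 1024 + " KB");
        }
    }
}
```

Lowering the budget shrinks the collection set, which mirrors the real trade-off: a smaller -XX:MaxGCPauseMillis means less memory reclaimed per pause and therefore more frequent collections.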

Key G1 Flags

# Use G1 explicitly (default on JDK 9+, but being explicit is self-documenting)
-XX:+UseG1GC

# Soft pause-time goal in milliseconds (default: 200)
-XX:MaxGCPauseMillis=100

# Number of GC threads for parallel work (default: auto-scaled to CPU count)
-XX:ParallelGCThreads=8

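# Concurrent marking threads, which run alongside the app (default: about a quarter of ParallelGCThreads)
# Refinement threads are sized separately with -XX:G1ConcRefinementThreads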
-XX:ConcGCThreads=2

# Initiate concurrent marking when the heap is X% full (default: 45)
# With adaptive IHOP (on by default) this is only the starting value; G1 adjusts it at runtime
# Lower this if you see frequent Full GCs, to give G1 more time to finish marking
-XX:InitiatingHeapOccupancyPercent=35

# Region size (power of 2, 1m–32m); usually let the JVM choose
-XX:G1HeapRegionSize=4m

A real production command line might look like:

java -Xms4g -Xmx4g \
     -XX:+UseG1GC \
     -XX:MaxGCPauseMillis=150 \
     -XX:InitiatingHeapOccupancyPercent=40 \
     -Xlog:gc*:file=/var/log/app/gc.log:time,uptime,level,tags:filecount=5,filesize=20m \
     -jar myapp.jar
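To confirm which collector a running JVM actually selected, the standard java.lang.management API can be queried from inside the application. The following is a small sketch; the class name GcReport is made up for this example, and the bean names it prints depend on the collector and JDK version in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcReport {
    /** Names of the collector beans the JVM registered, e.g. one per generation. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated collection time since JVM start.
            System.out.printf("%-28s collections=%d totalTimeMs=%d%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.printf("max heap = %d MB%n", maxHeapMb);
    }
}
```

Under G1 the bean names typically start with "G1", so a startup log line built from collectorNames() is a cheap way to catch a deployment that silently fell back to a different collector.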

ZGC — Ultra-Low-Latency Garbage Collection

ZGC (production-ready since JDK 15, generational ZGC added in JDK 21) is designed for one primary goal: sub-millisecond STW pauses regardless of heap size. It achieves this through two advanced techniques:

Colored pointers: ZGC encodes GC metadata (mark bits, relocation state) directly inside the 64-bit object reference. A load barrier checks those bits each time the application loads a reference from the heap, so the work is done incrementally by running threads rather than by scanning the whole heap while paused.
Concurrent relocation: unlike G1 (which must move objects STW), ZGC relocates live objects while the application is running. The load barrier is self-healing: when it sees a stale reference, it looks up the object's new address, rewrites the reference in place, and returns the updated value.
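The two techniques can be modelled in a few lines of plain Java. This is a toy model only: the bit layout (44 address bits plus one "remapped" bit), the forwarding map, and the long[] standing in for heap fields are all assumptions for illustration and do not match ZGC's real layout, which has changed between releases.

```java
import java.util.HashMap;
import java.util.Map;

final class ColoredPointerSketch {
    // Toy layout: low 44 bits hold the address, bit 44 means "already remapped".
    static final long ADDRESS_MASK = (1L << 44) - 1;
    static final long REMAPPED_BIT = 1L << 44;

    // Old address -> new address for objects the collector has relocated.
    final Map<Long, Long> forwarding = new HashMap<>();
    // Stands in for reference fields stored in the heap.
    final long[] fields;

    ColoredPointerSketch(int fieldCount) {
        this.fields = new long[fieldCount];
    }

    /** Load barrier: returns a usable address and heals stale references in place. */
    long load(int index) {
        long ref = fields[index];
        if ((ref & REMAPPED_BIT) != 0) {
            return ref & ADDRESS_MASK; // fast path: reference is already good
        }
        long oldAddress = ref & ADDRESS_MASK;
        long newAddress = forwarding.getOrDefault(oldAddress, oldAddress);
        fields[index] = newAddress | REMAPPED_BIT; // self-heal so the next load is fast
        return newAddress;
    }

    public static void main(String[] args) {
        ColoredPointerSketch heap = new ColoredPointerSketch(1);
        heap.fields[0] = 0x1000L;              // stale reference, not yet remapped
        heap.forwarding.put(0x1000L, 0x8000L); // the object was moved concurrently
        System.out.println(Long.toHexString(heap.load(0))); // slow path, heals the field
        System.out.println(Long.toHexString(heap.load(0))); // fast path
    }
}
```

The point of the model is that only the first load of a stale reference pays for the lookup; after healing, later loads of the same field take the fast path, which is how relocation can proceed without stopping the application.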

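What "generational ZGC" means (JDK 21+): The original ZGC was non-generational: it treated all objects identically. Generational ZGC adds a young/old split, which improves throughput by exploiting the weak generational hypothesis (most objects die young), because young collections can reclaim short-lived garbage without marking the whole heap. On JDK 21 and 22, enable it with -XX:+UseZGC -XX:+ZGenerational; -XX:+ZGenerational on its own does not select ZGC. From JDK 23, generational mode is the default whenever -XX:+UseZGC is set, and the non-generational mode was removed in JDK 24.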

Key ZGC Flags

# Enable ZGC
-XX:+UseZGC

# On JDK 21 and 22, opt in to generational mode (strongly recommended)
# From JDK 23 it is the default and this flag is no longer needed
-XX:+ZGenerational

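# ZGC has no pause-time goal to tune: -XX:MaxGCPauseMillis does not affect it
# Its main knobs are the heap size (-Xmx) and, optionally, a soft heap limit
# that ZGC tries to stay under while still being allowed to grow to -Xmx
-XX:SoftMaxHeapSize=3g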

# Concurrent GC threads (ZGC does almost everything concurrently)
-XX:ConcGCThreads=4

G1 vs ZGC: When to Choose Which

Concern	G1GC	ZGC
Pause target	Tens to hundreds of ms	Sub-millisecond
Throughput overhead	Low (~5–10% vs. Parallel GC)	Slightly higher (load barriers)
Heap size sweet spot	4 GB – 32 GB	Any, including multi-terabyte
Memory footprint	Lower	Higher (no compressed oops; needs headroom for concurrent relocation)
Minimum JDK	JDK 7u4 (default since JDK 9)	JDK 11 (experimental); JDK 15 (production); JDK 21 for generational
Typical use case	Web services, batch, microservices	Real-time trading, gaming, latency-critical APIs

Start with G1. Its defaults are well-tuned for most services. Switch to ZGC only when profiling evidence shows GC pauses are actually hurting your latency SLO — not based on speculation. Premature GC tuning wastes engineering time and introduces risk.

Heap-Sizing Flags: The Foundation of GC Tuning

Before touching any collector-specific flag, set the heap size correctly. These three flags apply to every collector:

# Minimum heap size — JVM starts with at least this much
-Xms2g

# Maximum heap size — JVM never exceeds this
-Xmx2g

# Recommended: set Xms == Xmx in production
# Prevents the JVM from requesting/releasing memory from the OS,
# which avoids unpredictable OS-level pauses and wasted startup time.

# Maximum size of the Metaspace (class metadata, not counted in Xmx)
-XX:MaxMetaspaceSize=256m

Never set -Xmx higher than the available physical RAM minus OS and other process overhead. If the JVM must swap to disk, every GC cycle becomes catastrophically slow. A common rule: leave at least 1–2 GB for the OS and other JVM overhead (threads, JIT buffers, Metaspace, direct buffers) when sizing -Xmx.
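The headroom rule is simple subtraction, sketched below. Every number in main is an illustrative assumption (thread count, direct-buffer budget, code cache and OS reserve vary widely between services), and HeapBudget is a hypothetical helper, not a JDK API.

```java
final class HeapBudget {
    /**
     * Memory left for -Xmx after subtracting non-heap consumers from the
     * machine or container limit. All values are in megabytes.
     */
    static long suggestedHeapMb(long totalMb, long metaspaceMb, long directBufferMb,
                                int threads, long stackMbPerThread,
                                long codeCacheMb, long osReserveMb) {
        long nonHeapMb = metaspaceMb + directBufferMb
                + (long) threads * stackMbPerThread
                + codeCacheMb + osReserveMb;
        long heapMb = totalMb - nonHeapMb;
        if (heapMb <= 0) {
            throw new IllegalArgumentException(
                    "non-heap needs (" + nonHeapMb + " MB) exceed total memory");
        }
        return heapMb;
    }

    public static void main(String[] args) {
        // Assumed workload: 8 GB limit, 256 MB Metaspace cap, 256 MB direct buffers,
        // 200 threads with 1 MB stacks, 240 MB code cache, 1 GB left for the OS.
        long heapMb = suggestedHeapMb(8192, 256, 256, 200, 1, 240, 1024);
        System.out.println("upper bound for -Xmx: " + heapMb + " MB");
    }
}
```

Treat the result as an upper bound and round down; the figure that matters in the end is the process's resident memory measured under load.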

Enabling GC Logging (Essential for Tuning)

You cannot tune what you cannot measure. Always enable GC logging in production:

# JDK 9+ unified logging syntax
-Xlog:gc*:file=/var/log/app/gc.log:time,uptime,level,tags:filecount=10,filesize=20m

This writes rolling GC logs with timestamps. Feed them into GCEasy (gceasy.io) or GCViewer for visual analysis — you will immediately see pause time distributions, allocation rates, and whether you are triggering Full GCs.
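For a quick check without external tools, pause durations can also be pulled out of the log with a regular expression. The sketch below assumes pause summary lines of the form "GC(n) Pause ... 8.412ms", which is how unified logging typically reports them; the sample lines are hand-written for illustration, not taken from a real run, and the class name GcLogPauses is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GcLogPauses {
    // Group 1: pause kind (e.g. "Pause Young"), group 2: duration in milliseconds.
    private static final Pattern PAUSE =
            Pattern.compile("GC\\(\\d+\\) (Pause \\w+).* (\\d+(?:\\.\\d+)?)ms$");

    /** Durations (ms) of all stop-the-world pause summary lines, in log order. */
    static List<Double> pauseMillis(List<String> lines) {
        List<Double> pauses = new ArrayList<>();
        for (String line : lines) {
            Matcher m = PAUSE.matcher(line);
            if (m.find()) {
                pauses.add(Double.parseDouble(m.group(2)));
            }
        }
        return pauses;
    }

    /** Number of Full GC pauses found in the log. */
    static long fullGcCount(List<String> lines) {
        long count = 0;
        for (String line : lines) {
            Matcher m = PAUSE.matcher(line);
            if (m.find() && m.group(1).equals("Pause Full")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
                "[2024-05-01T10:15:30.123+0000][12.345s][info][gc,start] GC(7) Pause Young (Normal) (G1 Evacuation Pause)",
                "[2024-05-01T10:15:30.131+0000][12.353s][info][gc      ] GC(7) Pause Young (Normal) (G1 Evacuation Pause) 512M->128M(4096M) 8.412ms",
                "[2024-05-01T10:15:41.002+0000][23.224s][info][gc      ] GC(8) Pause Remark 600M->580M(4096M) 2.101ms",
                "[2024-05-01T10:16:02.500+0000][44.722s][info][gc      ] GC(9) Pause Full (G1 Compaction Pause) 3900M->900M(4096M) 1850.774ms");
        System.out.println("pauses (ms): " + pauseMillis(sample));
        System.out.println("full GCs:    " + fullGcCount(sample));
    }
}
```

In practice the lines would come from Files.readAllLines on the rolling log files. Any non-zero Full GC count is worth investigating before touching other flags.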

A Practical Starting Configuration

# Example: Spring Boot service, 8 GB container, latency-sensitive
# Run with:
#   java [flags below] -jar service.jar

-Xms6g
-Xmx6g
-XX:MaxMetaspaceSize=256m
-XX:+UseG1GC
-XX:MaxGCPauseMillis=100
-XX:InitiatingHeapOccupancyPercent=40
-Xlog:gc*:file=/var/log/service/gc.log:time,uptime,level,tags:filecount=10,filesize=20m

Measure pause times from the GC log after running under realistic load before changing anything else. Tune one flag at a time and re-measure — GC tuning is an empirical discipline, not a checklist.

Summary

G1GC: region-based, pause-time-goal-driven, default since JDK 9, excellent for 4–32 GB heaps.
ZGC: concurrent relocation with colored pointers, sub-millisecond pauses, generational mode in JDK 21 greatly improves throughput.
Heap sizing: set -Xms == -Xmx in production, leave headroom for the OS, cap Metaspace.
Always log GC output and analyse it before tuning flags — measure first, change second.