JVM Internals & Performance

Memory Leaks in Java

Java has a garbage collector, so many developers assume memory leaks are impossible. That assumption is wrong and dangerous. A memory leak in Java means objects that the application no longer needs are still reachable from a GC root and are therefore never collected. The GC only frees what is unreachable; if a long-lived object keeps a reference to a short-lived object, the short-lived object stays in memory for as long as the long-lived one does.

The GC contract: The garbage collector only reclaims objects that are unreachable from every GC root (static fields, live thread stacks, JNI handles, and so on). It does nothing for objects that are still reachable, even if the references that keep them reachable are logically meaningless. Leaks are a reference-management problem, not a GC flaw.
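
The contract above can be observed with a small experiment. This is a minimal sketch, not a benchmark: the class name ReachabilityDemo and the LONG_LIVED list are hypothetical names chosen for illustration, and System.gc() is only a hint, so the outcome after the list is cleared is typical rather than guaranteed.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

class ReachabilityDemo {
    // A static field acts as a long-lived root while the class is loaded.
    static final List<byte[]> LONG_LIVED = new ArrayList<>();

    /** Allocates a buffer, stores it in LONG_LIVED, and returns a weak handle to it. */
    static WeakReference<byte[]> allocateAndRetain() {
        byte[] buffer = new byte[1024];
        LONG_LIVED.add(buffer);             // strong reference from a root
        return new WeakReference<>(buffer); // does not keep the buffer alive by itself
    }

    public static void main(String[] args) {
        WeakReference<byte[]> handle = allocateAndRetain();

        System.gc(); // a hint only; the JVM may ignore it
        // The buffer is strongly reachable through LONG_LIVED, so it cannot be collected.
        System.out.println("while retained: " + (handle.get() != null)); // prints "while retained: true"

        LONG_LIVED.clear(); // drop the only strong reference
        System.gc();
        // Usually false at this point, but the timing of collection is not guaranteed.
        System.out.println("after clear: " + (handle.get() != null));
    }
}
```

The weak handle is only a probe: it lets the program ask whether the object still exists without keeping it alive.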

Why Leaks Are Subtle

Leaks do not crash the JVM immediately. They cause a slow, steady rise in heap usage — a symptom called memory creep. The application runs fine for hours or days, then old-generation GC pauses get longer, throughput drops, and eventually an OutOfMemoryError occurs. By that point, the root cause may be deep in a cache or listener registered days ago.

Pattern 1: Static Collections as Caches

The most common leak: a static field holds a collection that grows without bound because items are added but never removed.

    import java.util.HashMap;
    import java.util.Map;

    public class RequestTracker {
        // A static field lives as long as its class stays loaded, which for
        // application classes is usually the lifetime of the JVM.
        private static final Map<String, Request> ACTIVE = new HashMap<>();

        public static void register(String id, Request req) {
            ACTIVE.put(id, req);
        }
        // BUG: remove() is never called after the request finishes
    }

Fix: Remove entries when they are no longer needed, or use a bounded data structure. If you need a cache that evicts old entries automatically, subclass LinkedHashMap and override removeEldestEntry, or use a purpose-built cache such as Caffeine.

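    import java.util.Collections;
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Self-evicting LRU cache: keeps at most MAX_SIZE entries
    private static final int MAX_SIZE = 1_000;

    private static final Map<String, Request> CACHE = Collections.synchronizedMap(
        // accessOrder = true orders entries from least to most recently used.
        // Type arguments are spelled out because the diamond operator on an
        // anonymous class only compiles on Java 9 and later.
        new LinkedHashMap<String, Request>(MAX_SIZE, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Request> eldest) {
                return size() > MAX_SIZE;
            }
        });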
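
The other option named in the fix, removing entries once they are no longer needed, could look roughly like the sketch below. The nested Request class, the handle method, and activeCount are hypothetical stand-ins introduced for illustration; the point is only that the removal sits in a finally block so it also runs when the handler throws.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class RequestTracker {
    // Hypothetical stand-in for the lesson's Request type.
    static final class Request {
        final String payload;
        Request(String payload) { this.payload = payload; }
    }

    private static final Map<String, Request> ACTIVE = new ConcurrentHashMap<>();

    static int activeCount() {
        return ACTIVE.size();
    }

    /** Tracks the request only for the duration of the handler. */
    static void handle(String id, Request req, Runnable handler) {
        ACTIVE.put(id, req);
        try {
            handler.run();
        } finally {
            ACTIVE.remove(id); // runs on normal completion and on exception
        }
    }
}
```

With this shape the map can never hold more entries than there are requests in flight, so its size is bounded by concurrency rather than by uptime.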

Pattern 2: Forgotten Event Listeners and Callbacks

Any time an object registers itself as a listener with a longer-lived publisher, the publisher holds a reference to it. If the listener is never deregistered, it cannot be collected — and neither can anything it references.

    class Dashboard implements MetricsListener, AutoCloseable {
        private final MetricsService svc; // application-scoped (lives forever)

        Dashboard(MetricsService svc) {
            this.svc = svc;
            // From here on, svc holds a strong reference to this Dashboard.
            // LEAK if close() is never called: when the Dashboard is "closed"
            // in the UI, svc still keeps it (and everything it references) reachable.
            svc.addListener(this);
        }

        @Override
        public void onMetric(Metric m) { /* update UI */ }

        // Fix: deregister when the Dashboard is closed
        @Override
        public void close() {
            svc.removeListener(this);
        }
    }

Implement AutoCloseable on any object that registers callbacks or acquires resources. Callers can then use try-with-resources to guarantee cleanup, and many IDE inspections and static analysis tools can warn when such an object is never closed.
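
A self-contained sketch of that advice follows. MetricsService here is a hypothetical in-memory publisher written only for this example, the metric is simplified to a String, and the open factory method is an assumption added so that registration happens after the constructor has finished.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

interface MetricsListener {
    void onMetric(String metric);
}

// Hypothetical long-lived publisher, reduced to the minimum for the example.
class MetricsService {
    private final List<MetricsListener> listeners = new CopyOnWriteArrayList<>();

    void addListener(MetricsListener l) { listeners.add(l); }
    void removeListener(MetricsListener l) { listeners.remove(l); }
    int listenerCount() { return listeners.size(); }

    void publish(String metric) {
        for (MetricsListener l : listeners) l.onMetric(metric);
    }
}

class Dashboard implements MetricsListener, AutoCloseable {
    private final MetricsService svc;
    private int received;

    private Dashboard(MetricsService svc) { this.svc = svc; }

    /** Registers the listener only after the object is fully constructed. */
    static Dashboard open(MetricsService svc) {
        Dashboard d = new Dashboard(svc);
        svc.addListener(d);
        return d;
    }

    @Override
    public void onMetric(String metric) { received++; }

    int received() { return received; }

    /** Deregisters, so the publisher no longer keeps this Dashboard reachable. */
    @Override
    public void close() { svc.removeListener(this); }
}
```

A caller would then write try (Dashboard d = Dashboard.open(svc)) { ... } and the listener is removed when the block exits, whether normally or through an exception.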

Pattern 3: Inner Classes Capturing Outer Instances

A non-static inner class holds an implicit reference to its enclosing instance whenever it uses that instance (and, with compilers older than JDK 18, even when it does not). If the inner class escapes to a longer-lived scope (a thread, a timer task, a framework callback), it drags the outer object with it.

    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    class ReportGenerator {
        private final byte[] largeDataSet = new byte[50 * 1024 * 1024]; // 50 MB

        void scheduleReport(ScheduledExecutorService scheduler) {
            // Non-static anonymous class: captures `this` (ReportGenerator)
            scheduler.scheduleAtFixedRate(new Runnable() {
                @Override
                public void run() {
                    generateReport();
                }
            }, 0, 1, TimeUnit.HOURS);
            // LEAK: even if all external references to ReportGenerator are dropped,
            // the scheduler keeps the Runnable alive, which keeps `this` alive,
            // which keeps largeDataSet alive forever.
        }

        private void generateReport() { /* ... */ }
    }

Fix: Use a static nested class, or a lambda that captures only the data it needs rather than the enclosing instance. If the task genuinely needs the outer object, hold it through a weak reference so the scheduler cannot keep it alive.

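    import java.lang.ref.WeakReference;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    class ReportGenerator {
        private final byte[] largeDataSet = new byte[50 * 1024 * 1024];

        void scheduleReport(ScheduledExecutorService scheduler) {
            // A lambda that calls an instance method (or uses this::generateReport)
            // captures `this`, exactly like the anonymous class. Capturing a
            // WeakReference instead means the scheduler no longer pins the generator.
            WeakReference<ReportGenerator> weakSelf = new WeakReference<>(this);
            scheduler.scheduleAtFixedRate(() -> {
                ReportGenerator self = weakSelf.get();
                if (self != null) self.generateReport();
                // Once the generator has been collected the task is a no-op; cancel
                // the returned ScheduledFuture if it should stop running entirely.
            }, 0, 1, TimeUnit.HOURS);
        }

        private void generateReport() { /* ... */ }
    }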
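
The static nested class option mentioned in the fix might be sketched as follows. ReportTask, its recordCount field, and the runs counter are hypothetical names for illustration, and the data set is shrunk to 1 KB so the example is cheap to run. The task receives a copied value rather than the generator, and the method returns the ScheduledFuture so the caller can cancel the task.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ReportGenerator {
    private final byte[] largeDataSet = new byte[1024]; // small stand-in for the 50 MB array

    /** Static nested class: it has no hidden reference to a ReportGenerator. */
    static final class ReportTask implements Runnable {
        private final int recordCount;                  // only the data the task needs
        final AtomicInteger runs = new AtomicInteger(); // counts executions, for the demo

        ReportTask(int recordCount) { this.recordCount = recordCount; }

        @Override
        public void run() {
            runs.incrementAndGet();
            // ... build the report from recordCount ...
        }
    }

    /** Schedules the task and hands back a handle the caller can cancel. */
    ScheduledFuture<?> scheduleReport(ScheduledExecutorService scheduler) {
        ReportTask task = new ReportTask(largeDataSet.length); // copies a value, not `this`
        return scheduler.scheduleAtFixedRate(task, 0, 1, TimeUnit.HOURS);
    }
}
```

Because the scheduler only ever sees a ReportTask, dropping the last reference to the ReportGenerator makes it and its data set collectable even while the task keeps running.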

Pattern 4: ThreadLocal Variables

Thread pools reuse threads. A ThreadLocal value set on a pooled thread persists for as long as that thread lives unless it is explicitly removed. In web containers, where the server manages the thread pool, this is a very common leak.

    private static final ThreadLocal<DatabaseConnection> CONN_HOLDER = new ThreadLocal<>();

    // In a servlet or filter:
    public void service(Request req, Response res) {
        CONN_HOLDER.set(openConnection());
        try {
            handleRequest(req, res);
        } finally {
            // REQUIRED: without this, the connection stays attached to the
            // pooled thread for as long as the thread lives
            CONN_HOLDER.remove();
        }
    }

Always call ThreadLocal.remove() in a finally block. Pooled threads typically live for the lifetime of the application, so a missing remove() effectively pins the value (and everything it references) in memory for the same duration. It can also cause correctness bugs when the same thread later handles an unrelated request and sees stale data.
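
The stale-data effect is easy to reproduce with a single-thread executor, which guarantees that both tasks run on the same pooled thread. This is a minimal sketch; ThreadLocalStaleDemo, CURRENT_USER, and the value "alice" are hypothetical names used only for the demonstration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalStaleDemo {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    /** Runs two "requests" on one pooled thread and returns what the second one sees. */
    static String valueSeenBySecondTask(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor(); // one thread, always reused

        Runnable firstRequest = () -> {
            CURRENT_USER.set("alice");
            try {
                // ... handle the request ...
            } finally {
                if (cleanUp) CURRENT_USER.remove();
            }
        };
        // The second request never calls set(); it only reads.
        Callable<String> secondRequest = CURRENT_USER::get;

        try {
            pool.submit(firstRequest).get();
            return pool.submit(secondRequest).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(valueSeenBySecondTask(false)); // prints "alice" (stale value leaked)
        System.out.println(valueSeenBySecondTask(true));  // prints "null"
    }
}
```

Without the remove() call, the second request inherits the first request's value, which is both a retained object and a potential data-exposure bug.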

Pattern 5: WeakHashMap Misuse

WeakHashMap is often recommended as a "self-cleaning cache", but it holds only its keys weakly; the values are held strongly by the map. If a value holds a strong reference back to its key (directly or indirectly), the key is always reachable through the map and is never collected, so the map never shrinks.

    // BROKEN self-cleaning cache
    Map<Widget, WidgetCache> cache = new WeakHashMap<>();

    class WidgetCache {
        final Widget owner; // strong reference back to the key!

        WidgetCache(Widget w) {
            this.owner = w;
        }
    }
    // Widget will never be collected because WidgetCache.owner keeps it reachable,
    // so the WeakHashMap entry never expires.
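
One way to repair the broken cache, sketched under the assumption that the value only occasionally needs its owner, is to make the back-reference weak (or to drop it and copy out the data the value needs). The WidgetCaches wrapper, the name and rendered fields, and cacheFor are hypothetical names for illustration.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

class WidgetCaches {
    static final class Widget {
        final String name;
        Widget(String name) { this.name = name; }
    }

    static final class WidgetCache {
        private final WeakReference<Widget> owner; // weak back-reference: does not pin the key
        final String rendered;                     // derived data copied out of the widget

        WidgetCache(Widget w) {
            this.owner = new WeakReference<>(w);
            this.rendered = "<" + w.name + ">";    // a new String, not a reference to w
        }

        /** Returns the owning widget, or null once it has been collected. */
        Widget owner() { return owner.get(); }
    }

    private final Map<Widget, WidgetCache> cache = new WeakHashMap<>();

    WidgetCache cacheFor(Widget w) {
        return cache.computeIfAbsent(w, WidgetCache::new);
    }

    int size() { return cache.size(); }
}
```

With no strong path from the value back to the key, the entry becomes eligible for removal once the application drops its own references to the Widget. When that actually happens depends on the collector.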

Spotting Leaks: Key Signals

  • Heap usage grows monotonically across multiple full GC cycles without coming back down.
  • Old-generation fill rate increases over time even under constant load.
  • GC logs show full GC running more and more frequently while reclaiming less and less memory each cycle.
  • Heap histogram (jmap -histo:live <pid>) shows a class whose instance count keeps growing but should be bounded.
  • Heap dump diff between two snapshots taken minutes apart reveals which object types accumulated.

The heap histogram is your first stop. Run jcmd <pid> GC.heap_info to see aggregate heap sizes, then jmap -histo:live <pid> | head -30 to list the classes with the largest total shallow size, along with their instance counts. Note that the :live option forces a full GC before counting. Compare outputs taken five minutes apart under load: the class whose instance count keeps climbing is the suspect.
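
The first signal in the list, heap that does not come back down after GC, can also be sampled from inside the process with the standard java.lang.management API. The sketch below is one possible approach, not a replacement for GC logs; PostGcHeapSampler and the half-second interval are arbitrary choices for illustration, and a real monitor would sample far less often and export the value to a metrics system.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class PostGcHeapSampler {
    /** Sums, over all heap pools, the bytes in use right after each pool's most recent collection. */
    static long usedAfterLastGcBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) continue;
            MemoryUsage afterGc = pool.getCollectionUsage(); // null if the pool does not support it
            if (afterGc != null) total += afterGc.getUsed();
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        // Take a few samples; a value that keeps rising under steady load suggests a leak.
        for (int i = 0; i < 3; i++) {
            System.out.printf("heap used after last GC: %d KiB%n", usedAfterLastGcBytes() / 1024);
            Thread.sleep(500);
        }
    }
}
```

Post-collection usage is a better leak indicator than current usage because it excludes garbage that simply has not been collected yet.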

Defensive Patterns to Prevent Leaks

  1. Prefer bounded caches (Caffeine, Guava Cache) over raw maps for any application-scoped state.
  2. Always deregister listeners and callbacks in a close() / destroy() lifecycle method.
  3. Use static nested classes (or lambdas that capture only specific values) instead of non-static inner classes passed to long-lived objects.
  4. Pair every ThreadLocal.set() with a ThreadLocal.remove() in a finally block.
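  5. Use weak or soft references purposefully: a WeakReference is cleared once its referent is no longer strongly reachable, typically at the next GC, so it suits metadata tied to another object's lifetime; a SoftReference is cleared only when the JVM is running short of memory (and always before an OutOfMemoryError is thrown), so it suits memory-sensitive caches.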
  6. Run heap-dump analysis as part of your load-testing pipeline — catching a leak at 1,000 requests is far cheaper than at 10,000,000.

Summary

Memory leaks in Java are entirely about the reference graph. The GC will collect anything it cannot reach; leaks happen when a long-lived root keeps a chain of references alive to objects that are logically dead. The five patterns to watch — unbounded static collections, forgotten listeners, inner classes escaping to long-lived scopes, un-removed ThreadLocal values, and misused WeakHashMap — cover the vast majority of production leaks. Instrument your applications with heap histograms, GC logs, and heap dumps, and treat a steadily rising old-generation size as a bug, not a tuning problem.