Race Conditions & Shared State
A race condition is a bug where the correctness of a program depends on the relative timing of multiple threads. It appears intermittently, survives most test runs, and tends to surface in production under load — making it one of the hardest categories of bug to diagnose and fix.
The root cause is always the same: two or more threads read and write a shared mutable variable without coordination.
A Simple Counter Gone Wrong
Consider the most minimal example: two threads each increment a shared counter 100,000 times.
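A minimal sketch of such a program follows. The class name RacyCounter and the helper runOnce are assumptions made for illustration; the counter field is deliberately left without any synchronisation.

```java
class RacyCounter {
    static final int INCREMENTS_PER_THREAD = 100_000;

    // Shared mutable state, deliberately left unsynchronized.
    static int counter = 0;

    // Runs two threads that each increment the counter, then returns the final value.
    static int runOnce() {
        counter = 0;
        Runnable work = () -> {
            for (int i = 0; i < INCREMENTS_PER_THREAD; i++) {
                counter++; // looks like one step, but is read, add, write
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter;
    }

    public static void main(String[] args) {
        for (int run = 1; run <= 5; run++) {
            System.out.println("Run " + run + ": " + runOnce());
        }
    }
}
```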
Run this a few times and you will see values like 134,827 or 189,003 — never reliably 200,000. The reason is that counter++ is not atomic.
The Check-Then-Act Problem
At the bytecode level, counter++ decomposes into three distinct operations:
- Read — load the current value of counter from main memory into a CPU register.
- Increment — add 1 inside the register.
- Write — store the result back to main memory.
The thread scheduler can preempt a thread between any two of these steps. Here is a concrete interleaving that loses an update:
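One way to make such an interleaving concrete is to replay it by hand in a single thread. The sketch below assumes the counter starts at 42; the local variables register1 and register2 are hypothetical stand-ins for the CPU registers used by Thread 1 and Thread 2.

```java
class LostUpdateReplay {
    static int counter;

    // Replays one unlucky schedule step by step and returns the final counter value.
    static int replay() {
        counter = 42;                  // assumed starting value
        int register1 = counter;       // Thread 1 reads 42, then is preempted
        int register2 = counter;       // Thread 2 reads 42 as well
        register1 = register1 + 1;     // Thread 1 computes 43
        register2 = register2 + 1;     // Thread 2 computes 43
        counter = register1;           // Thread 1 writes 43
        counter = register2;           // Thread 2 writes 43, hiding Thread 1's update
        return counter;
    }

    public static void main(String[] args) {
        System.out.println("Two increments from 42 gave: " + replay()); // 43, not 44
    }
}
```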
This is called a lost update. Both threads believed they were working with 42 and both wrote 43, so the net effect is +1 instead of +2. Multiply this across 200,000 iterations and the loss is substantial.
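Any operation that reads shared state and then acts on what it read is vulnerable in the same way. For example, if (map.get(k) == null) map.put(k, v) is a check-then-act sequence. Between the check and the act, another thread can change the map. The same rule applies to i++, list.size() == 0 ? list.add(x) : ..., and countless other idioms that look like single steps but are really compound operations.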
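The map idiom can be sketched as follows, next to an atomic alternative. The class and method names are assumptions for illustration. ConcurrentMap.putIfAbsent is a standard java.util.concurrent method that performs the check and the act as one atomic step; note that the racy version stays racy even though the map itself is thread-safe.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class CheckThenAct {
    // Racy: the check and the act are separate calls, so another thread
    // can insert a value for k between them and have it overwritten.
    static String racyPut(ConcurrentMap<String, String> map, String k, String v) {
        if (map.get(k) == null) {   // check
            map.put(k, v);          // act
        }
        return map.get(k);
    }

    // Atomic: putIfAbsent checks and inserts in a single indivisible operation.
    // It returns the existing value, or null if v was inserted.
    static String atomicPut(ConcurrentMap<String, String> map, String k, String v) {
        String existing = map.putIfAbsent(k, v);
        return existing != null ? existing : v;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> map = new ConcurrentHashMap<>();
        System.out.println(atomicPut(map, "k", "first"));
        System.out.println(atomicPut(map, "k", "second")); // the first value wins
    }
}
```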
Why the JMM Makes It Worse
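Beyond scheduling interleaving, the Java Memory Model (JMM) does not guarantee that a write made by one thread becomes visible to another unless the two are ordered by a happens-before relationship. In practice, the JIT compiler may keep a variable in a register and each core may hold it in its own cache or store buffer, so Thread 1 may read a stale value of counter long after Thread 2 has updated it. This is a visibility problem, distinct from, and additive to, the atomicity problem.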
So with unsynchronized shared state you face two independent hazards simultaneously:
- Atomicity failure — a compound operation is interrupted mid-way.
- Visibility failure — a thread never sees another thread's write at all.
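The visibility hazard can be illustrated with a stop flag. The sketch below is an assumption-laden example, not part of the original lesson: the flag is declared volatile so that the program is safe to run and the worker is guaranteed to see the write. If volatile were removed, the JMM would allow the worker to keep reading a stale false and spin forever.

```java
class VisibilityDemo {
    // volatile creates a happens-before edge between the write in the
    // requesting thread and the reads in the worker thread.
    static volatile boolean stopRequested = false;

    // Starts a spinning worker, requests a stop, and reports whether
    // the worker terminated within the given timeout.
    static boolean stopsWithinMillis(long timeoutMillis) {
        stopRequested = false;
        Thread worker = new Thread(() -> {
            while (!stopRequested) {
                Thread.onSpinWait(); // busy-wait hint, available since Java 9
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive if something goes wrong
        worker.start();
        try {
            Thread.sleep(50);
            stopRequested = true;
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("Worker stopped: " + stopsWithinMillis(5_000));
    }
}
```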
Detecting Races in Practice
Race conditions are hard to reproduce reliably because they depend on OS scheduling decisions, CPU count, JIT compilation warmup, and load. Standard techniques for finding them:
- Stress testing — run the code with many threads for a long time. The more CPUs, the more interleavings, the more likely the bug manifests.
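- Race detectors and model checkers — ThreadSanitizer (for native code) instruments memory accesses and reports conflicting unsynchronized accesses at run time, while Java PathFinder is a model checker that explores thread interleavings systematically.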
- Code review — every field accessed by more than one thread without synchronization is a suspect.
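A stress test along these lines can be sketched as below. The harness is hypothetical: the class name, the shared field, and the parameters are assumptions. It uses a CountDownLatch as a start gate so that all workers begin incrementing at roughly the same moment, which tends to increase contention, and it counts how many rounds end with the wrong total.

```java
import java.util.concurrent.CountDownLatch;

class StressHarness {
    static int shared; // unsynchronized field under test

    // Runs `trials` rounds. In each round, `threads` workers increment `shared`
    // `perThread` times. Returns the number of rounds with a wrong final total.
    static int countFailures(int trials, int threads, int perThread) {
        int failures = 0;
        for (int t = 0; t < trials; t++) {
            shared = 0;
            CountDownLatch startGate = new CountDownLatch(1);
            Thread[] workers = new Thread[threads];
            for (int i = 0; i < threads; i++) {
                workers[i] = new Thread(() -> {
                    try {
                        startGate.await(); // wait until every worker has been started
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    for (int n = 0; n < perThread; n++) {
                        shared++;
                    }
                });
                workers[i].start();
            }
            startGate.countDown(); // release all workers at once
            for (Thread w : workers) {
                try {
                    w.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
            if (shared != threads * perThread) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("Failed rounds: " + countFailures(20, 4, 100_000) + " of 20");
    }
}
```

A result of zero failed rounds does not prove the code is correct; it only means this run did not hit a bad interleaving.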
Another Classic Pattern: Lazy Initialisation
Lazy initialisation without synchronisation is a textbook race:
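A minimal sketch of the pattern, assuming a Registry class with a static accessor named getInstance (the method name and the private constructor are assumptions for illustration):

```java
class Registry {
    // Shared, mutable, and accessed without any synchronisation.
    private static Registry instance;

    private Registry() { }

    static Registry getInstance() {
        if (instance == null) {         // check: two threads can both see null
            instance = new Registry();  // act: each then creates and stores its own object
        }
        return instance;
    }

    public static void main(String[] args) {
        // Single-threaded use behaves as expected; the bug needs concurrent callers.
        System.out.println(getInstance() == getInstance());
    }
}
```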
Two threads can pass the null check simultaneously. Each creates its own Registry and stores it, and the second write overwrites the first. Any code that already captured the reference from the first write now holds a different instance from everyone else, so any state recorded in it is lost. Worse, because the reference is published without a happens-before relationship, another thread may see a non-null reference to a Registry whose fields it does not yet see as initialised, that is, a partially constructed object.
The standard fixes are synchronized, volatile with the double-checked locking pattern, or an initialisation-on-demand holder class. These are covered in later lessons.
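As a brief preview, the holder-class approach can be sketched as below. The class name LazyRegistry is an assumption, chosen to keep it apart from the racy version. The sketch relies on the JVM initialising a class lazily and in a thread-safe way, so no explicit locking appears in the code.

```java
class LazyRegistry {
    private LazyRegistry() { }

    // Holder is not initialised until getInstance() first touches it.
    // Class initialisation is performed once, under a JVM-managed lock.
    private static class Holder {
        static final LazyRegistry INSTANCE = new LazyRegistry();
    }

    static LazyRegistry getInstance() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        System.out.println(getInstance() == getInstance());
    }
}
```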
Identifying Shared Mutable State
Not all shared state causes races. The key questions are:
- Is it mutable? A final field written once in a constructor and never changed is safe to share across threads.
- Is it reachable from multiple threads? A local variable on the stack is never shared — each thread has its own stack frame.
- Is there at least one writer? If all threads only read, there is no race (though visibility rules still apply for initial publication).
If the answer to all three is yes, the code is a candidate for a race condition and must be protected.
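The three questions can be applied to a small class. The sketch below is hypothetical (all names are assumptions) and annotates each variable with the category it falls into; only the processed field answers yes to all three when one instance is shared between threads.

```java
class Classification {
    // Immutable: assigned once in the constructor, safe to share.
    private final int limit;

    // Mutable, reachable from every thread holding this object, and written
    // by sumUpToLimit(): a candidate for a race.
    private int processed = 0;

    Classification(int limit) {
        this.limit = limit;
    }

    int sumUpToLimit() {
        int sum = 0; // local variable: confined to the calling thread's stack
        for (int i = 1; i <= limit; i++) {
            sum += i;
        }
        processed++; // unsynchronized read-modify-write on shared state
        return sum;
    }

    int processedCount() {
        return processed;
    }

    public static void main(String[] args) {
        Classification c = new Classification(10);
        System.out.println(c.sumUpToLimit() + " after " + c.processedCount() + " call(s)");
    }
}
```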
What Synchronisation Is Not
It is tempting to believe that making operations fast eliminates races — it does not. A race is a structural problem in code, not a performance problem. The only remedies are:
- Eliminate mutability — use immutable objects; share freely.
- Eliminate sharing — give each thread its own copy via ThreadLocal or by design.
- Coordinate access — use synchronisation primitives (synchronized, volatile, java.util.concurrent classes).
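Applied to the counter example, two of these remedies might look like the sketch below. The class and method names are assumptions. withAtomic coordinates access through java.util.concurrent.atomic.AtomicInteger, and withConfinement eliminates sharing by letting each thread count in a local variable and combining the totals only after join().

```java
import java.util.concurrent.atomic.AtomicInteger;

class SafeCounters {
    static final int THREADS = 2;
    static final int INCREMENTS_PER_THREAD = 100_000;

    // Starts all threads and waits for them to finish.
    static void runAndJoin(Thread[] threads) {
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
    }

    // Coordinate access: incrementAndGet is an atomic read-modify-write.
    static int withAtomic() {
        AtomicInteger counter = new AtomicInteger();
        Thread[] threads = new Thread[THREADS];
        for (int i = 0; i < THREADS; i++) {
            threads[i] = new Thread(() -> {
                for (int n = 0; n < INCREMENTS_PER_THREAD; n++) {
                    counter.incrementAndGet();
                }
            });
        }
        runAndJoin(threads);
        return counter.get();
    }

    // Eliminate sharing: each thread counts locally and writes its own slot once.
    static int withConfinement() {
        int[] perThreadTotals = new int[THREADS];
        Thread[] threads = new Thread[THREADS];
        for (int i = 0; i < THREADS; i++) {
            final int slot = i;
            threads[i] = new Thread(() -> {
                int local = 0; // confined to this thread
                for (int n = 0; n < INCREMENTS_PER_THREAD; n++) {
                    local++;
                }
                perThreadTotals[slot] = local; // read by the caller only after join()
            });
        }
        runAndJoin(threads);
        int total = 0;
        for (int value : perThreadTotals) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(withAtomic() + " " + withConfinement());
    }
}
```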
The next lessons cover each of these remedies. For now, the essential insight is that a data race is the absence of a happens-before relationship between two accesses to the same variable from different threads, at least one of which is a write.
Summary
Race conditions arise when threads share mutable state without coordination. The counter++ example shows how an operation that looks atomic is actually read-increment-write, and an unlucky interleaving of those steps across threads silently loses updates. The JMM compounds the problem by allowing each thread to see a locally cached, stale view of memory. Races are non-deterministic, load-sensitive, and often invisible in tests. The only solutions are to remove mutability, remove sharing, or add explicit synchronisation, which is the subject of the remaining lessons in this tutorial.