Common Performance Pitfalls
Most production Java performance problems are not exotic — they are a handful of well-known anti-patterns repeated at scale. This lesson examines three that appear constantly in code reviews and profiling sessions: autoboxing overhead, string concatenation in loops, and under-sized or over-sized collections. Understanding the why behind each lets you make disciplined, evidence-based decisions instead of guessing.
1. Autoboxing and Unboxing
Java's type system separates primitive types (int, long, double, …) from their wrapper counterparts (Integer, Long, Double, …). Autoboxing is the compiler's automatic conversion between the two. While it enables ergonomic APIs, it hides real costs: heap allocation, GC pressure, and cache misses.
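A minimal sketch of the kind of loop where this cost shows up, assuming a hypothetical method `sumBoxed` whose accumulator `total` is declared as the wrapper type `Long` (the class and method names are illustrative, not taken from the lesson):

```java
final class BoxedSum {
    // The accumulator is a wrapper, so every += has to unbox it,
    // add, and box the result into a Long again.
    static Long sumBoxed(int n) {
        Long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(10_000_000));
    }
}
```

The code is correct; the cost is hidden entirely in the declared type of `total`.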
The compiler desugars total += i into roughly:
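Written out by hand, the hidden conversions look roughly like the following. This is a sketch of equivalent source code, not the exact bytecode javac emits, and the class and method names are hypothetical:

```java
final class DesugaredSum {
    static Long sumDesugared(int n) {
        Long total = Long.valueOf(0L);
        for (int i = 0; i < n; i++) {
            // total += i  expands to: unbox total, widen i to long, add, box the result.
            // Long.valueOf reuses cached instances only for small values (-128..127);
            // for anything larger it typically allocates a new object.
            total = Long.valueOf(total.longValue() + (long) i);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumDesugared(10_000_000));
    }
}
```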
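Unless the JIT's escape analysis manages to eliminate them, close to ten million Long objects are allocated and immediately discarded, which exercises the GC and pollutes the CPU cache. The fix is simple: declare the accumulator as a primitive long.
The primitive version allocates nothing and is often several times faster; figures of 5–10× are commonly quoted, but the exact ratio depends on the JVM, the hardware, and how much the JIT can optimise, so measure it on your own workload. The rule is: use primitives in computation-heavy paths; reserve wrappers for when the API forces it (generics, nullable fields, collections).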
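A sketch of the primitive version, assuming the same hypothetical summing loop; the only change is the declared type of the accumulator:

```java
final class PrimitiveSum {
    // long instead of Long: the addition happens on a primitive value
    // and no wrapper object is ever created.
    static long sumPrimitive(int n) {
        long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumPrimitive(10_000_000));
    }
}
```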
Map<String, Integer> stores boxed integers. Every map.get(key) hands back an Integer; using it arithmetically immediately unboxes it. If you are iterating a map and summing values, consider int[] arrays or Eclipse Collections / Koloboke primitive maps for hot paths.
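As an illustration of the trade-off, here is a sketch that counts occurrences of small integer keys two ways. It assumes the keys are dense, non-negative, and below a known bound (`keyRange`), which is what makes a plain `int[]` usable; all names are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

final class CountingDemo {
    // Boxed: every merge unboxes the old count, adds 1, and boxes the result.
    static Map<Integer, Integer> countBoxed(int[] samples) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int s : samples) {
            counts.merge(s, 1, Integer::sum);
        }
        return counts;
    }

    // Primitive: an int[] indexed by key needs no wrapper objects at all.
    // Assumes every sample lies in the range 0 <= s < keyRange.
    static int[] countPrimitive(int[] samples, int keyRange) {
        int[] counts = new int[keyRange];
        for (int s : samples) {
            counts[s]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] samples = {1, 2, 2, 3, 3, 3};
        System.out.println(countBoxed(samples));
        System.out.println(java.util.Arrays.toString(countPrimitive(samples, 4)));
    }
}
```

When keys are sparse or not integers, an array is not an option and a primitive-specialised map library is the closer substitute.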
A second autoboxing trap is equality comparison:
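A minimal sketch of the trap, using hypothetical helper methods `sameReference` and `sameValue` to make the two kinds of comparison explicit:

```java
final class WrapperEquality {
    // == on wrappers compares object references, not numeric values.
    static boolean sameReference(Integer x, Integer y) {
        return x == y;
    }

    // equals() compares the wrapped values.
    static boolean sameValue(Integer x, Integer y) {
        return x.equals(y);
    }

    public static void main(String[] args) {
        Integer small1 = 127, small2 = 127;   // inside the guaranteed cache range
        Integer large1 = 1000, large2 = 1000; // outside the default cache range

        System.out.println(sameReference(small1, small2)); // true: same cached instance
        System.out.println(sameReference(large1, large2)); // typically false with default JVM settings
        System.out.println(sameValue(large1, large2));     // true: values are compared
    }
}
```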
The JVM caches Integer instances for values −128 to 127 (the upper bound is configurable via -XX:AutoBoxCacheMax), so == happens to return true for small values and false for larger ones. The resulting bug tends to pass tests that use small numbers and surface only in production, where values fall outside the cache. Always use equals() for wrapper comparison.
2. String Concatenation in Loops
String is immutable. Every + or += on a String creates a new String object and copies all existing characters into it. In a loop this produces O(n²) character copies — the classic "Schlemiel the Painter" algorithm.
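A sketch of the anti-pattern, assuming a hypothetical method that glues a list of lines into one String:

```java
import java.util.List;

final class SlowConcat {
    // Anti-pattern: each += builds a brand-new String and copies
    // everything accumulated so far, so total work grows quadratically.
    static String joinLinesSlow(List<String> lines) {
        String result = "";
        for (String line : lines) {
            result += line + "\n";
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.print(joinLinesSlow(List.of("first", "second", "third")));
    }
}
```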
With 10,000 lines, iteration 5,000 first copies the text of the roughly 5,000 lines already accumulated, then appends the new line. Total work is therefore proportional to n². Each += still produces a complete new String, and neither javac nor the JIT reliably hoists the accumulation into a single StringBuilder that spans loop iterations.
"Hello " + name + "!" outside a loop is compiled by javac into a single concatenation: a StringBuilder chain before Java 9, and an invokedynamic call to StringConcatFactory from Java 9 onwards, which is usually faster still. The problem arises when a String is accumulated across loop iterations, because each iteration must materialise a complete new String before the next one can append to it.
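The usual remedy is a single StringBuilder that lives across the whole loop. A minimal sketch, with hypothetical names:

```java
import java.util.List;

final class FastConcat {
    // One growable buffer for the whole loop: each character is copied
    // an amortised constant number of times, so total work is linear.
    static String joinLines(List<String> lines) {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(joinLines(List.of("first", "second", "third")));
    }
}
```

If the approximate output length is known, passing it to the StringBuilder constructor also avoids intermediate buffer growth.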
For building delimited text at scale, prefer String.join(), StringJoiner, or Collectors.joining() from streams. All three build the result in a single linear pass instead of re-copying the accumulated text on every step.
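A short sketch of the three options side by side; the class and method names are hypothetical, while the library calls are the standard JDK ones:

```java
import java.util.List;
import java.util.StringJoiner;
import java.util.stream.Collectors;

final class JoiningDemo {
    // Simplest case: a delimiter and an Iterable of strings.
    static String withStringJoin(List<String> items) {
        return String.join(", ", items);
    }

    // StringJoiner additionally supports a prefix and a suffix.
    static String withStringJoiner(List<String> items) {
        StringJoiner joiner = new StringJoiner(", ", "[", "]");
        for (String item : items) {
            joiner.add(item);
        }
        return joiner.toString();
    }

    // Collectors.joining fits at the end of a stream pipeline.
    static String withCollector(List<String> items) {
        return items.stream()
                .map(String::toUpperCase)
                .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        List<String> items = List.of("a", "b", "c");
        System.out.println(withStringJoin(items));   // a, b, c
        System.out.println(withStringJoiner(items)); // [a, b, c]
        System.out.println(withCollector(items));    // A, B, C
    }
}
```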
3. Poor Collection Sizing
Java's resizable collections (ArrayList, HashMap, HashSet, StringBuilder) all start small and grow geometrically when full. In OpenJDK, HashMap (and HashSet, which is backed by it) doubles its table, StringBuilder roughly doubles its buffer, and ArrayList grows by about 50 %. Each growth step means: allocate a larger array, copy or rehash every existing element, then discard the old array. If you add one million elements to a default-capacity ArrayList, you trigger roughly 30 resize-and-copy cycles and produce as many discarded arrays for the GC to collect.
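To make the resize count concrete, the sketch below simulates ArrayList growth. It assumes the rule found in current OpenJDK sources (new capacity = old capacity + old capacity / 2, starting from 10), which is an implementation detail rather than a documented guarantee; the class and method names are hypothetical. It also shows the pre-sized alternative:

```java
import java.util.ArrayList;
import java.util.List;

final class ArrayListSizing {
    // Counts growth steps under the ASSUMED rule newCapacity = old + old / 2.
    static int growthSteps(int initialCapacity, int finalSize) {
        if (initialCapacity < 2) {
            throw new IllegalArgumentException("initialCapacity must be at least 2");
        }
        int capacity = initialCapacity;
        int steps = 0;
        while (capacity < finalSize) {
            capacity += capacity >> 1; // grow by 50 %
            steps++;
        }
        return steps;
    }

    // Pre-sizing: one backing array is allocated up front and never copied.
    static List<Integer> fillPresized(int expectedSize) {
        List<Integer> list = new ArrayList<>(expectedSize);
        for (int i = 0; i < expectedSize; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(growthSteps(10, 1_000_000)); // 29 under the assumed rule
        System.out.println(fillPresized(1_000).size()); // 1000
    }
}
```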
For HashMap and HashSet the calculation is slightly different because the default load factor is 0.75 — meaning the map resizes when it is 75 % full. To avoid any resize when you know the final size, pass capacity = expectedSize / 0.75 + 1:
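A minimal sketch of that formula in code. The helper names `capacityFor` and `presizedMap` are hypothetical; the only library call is the standard `HashMap(int initialCapacity)` constructor:

```java
import java.util.HashMap;
import java.util.Map;

final class MapSizing {
    // Initial capacity that keeps expectedSize entries below the
    // default 0.75 load-factor threshold, so no resize is triggered.
    static int capacityFor(int expectedSize) {
        return (int) (expectedSize / 0.75f) + 1;
    }

    static Map<String, Integer> presizedMap(int expectedSize) {
        return new HashMap<>(capacityFor(expectedSize));
    }

    public static void main(String[] args) {
        System.out.println(capacityFor(1_000)); // 1334
        Map<String, Integer> map = presizedMap(1_000);
        for (int i = 0; i < 1_000; i++) {
            map.put("key" + i, i);
        }
        System.out.println(map.size()); // 1000
    }
}
```

HashMap rounds the requested capacity up to the next power of two internally, so the actual table may be somewhat larger than the number passed in.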
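Guava's Maps.newHashMapWithExpectedSize(n) encapsulates that formula so you never mis-calculate, and since Java 19 the JDK itself offers the static factory HashMap.newHashMap(n) for the same purpose.
Over-sizing is a pitfall too. new ArrayList<>(10_000_000) when you end up storing 100 items allocates a ten-million-slot backing array up front, which is about 40 MB of heap with compressed references. Pre-size when you have a good estimate; otherwise let the collection grow naturally. The sweet spot is a reasonable upper-bound estimate, not the worst-case maximum.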
Measuring, Not Guessing
Each pitfall above is invisible without measurement. Before optimising:
- Profile first (JFR / async-profiler / VisualVM) to confirm the hot path.
- Benchmark with JMH to get a reproducible before/after number.
- Apply one change at a time and re-measure.
Blind optimisation often moves work from a cold path to a hot one. The three pitfalls in this lesson are safe to fix pre-emptively only in tight loops or critical paths where the cost is already well-understood.
Summary
- Autoboxing: use primitives in hot loops; be suspicious of wrapper types in arithmetic; use equals() for wrapper comparison.
- String concatenation: never use += in a loop; use StringBuilder, StringJoiner, or Collectors.joining().
- Collection sizing: pre-size when you know the approximate final size; remember the 0.75 load-factor for maps; avoid wildly over-sizing.