The First-Level Cache
Every EntityManager in JPA carries a built-in cache called the first-level cache (also known as the session cache or persistence context cache). Unlike a shared cache that lives between requests, this cache is scoped to a single EntityManager instance — it lives and dies with one unit of work. Understanding it is essential because it shapes every query you write, every object you compare, and every performance decision you make.
What the First-Level Cache Is
Hibernate maintains an internal identity map keyed by entity type plus primary key. When you load or persist an entity, Hibernate registers that instance in the map; when you merge a detached entity, it registers the managed copy that merge() returns. Every subsequent find() for the same type and ID within the same EntityManager returns the exact same Java object, with no round-trip to the database and no second instantiation.
This cache is always on. There is no configuration switch to disable it. It is fundamental to how JPA's object-identity guarantee works: within a persistence context, the same database row is always represented by the same Java reference.
Within a single persistence context, find() calls for the same entity class and primary key must return the same object instance. Hibernate fulfills this via the first-level cache.
Identity Guarantee in Practice
Consider this scenario inside a Spring service method annotated with @Transactional:
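A minimal sketch of that scenario follows. So that it runs without a JPA provider or a database, find() is supplied by a small in-memory stand-in: ToyEntityManager is a hypothetical class that mimics only the shape of EntityManager.find(Class, Object) and counts simulated SELECTs. Order and OrderService are likewise assumed names. In a real Spring application the same two calls would go to the injected EntityManager inside a @Transactional method.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity; a real one would carry @Entity and @Id annotations.
final class Order {
    final Long id;
    Order(Long id) { this.id = id; }
}

// In-memory stand-in that mimics only EntityManager.find(Class, Object).
// It is NOT Hibernate; it exists so the sketch runs without a database.
final class ToyEntityManager {
    private final Map<Class<?>, Map<Object, Object>> identityMap = new HashMap<>();
    int selectCount = 0; // number of simulated SELECT statements

    <T> T find(Class<T> type, Object id) {
        Map<Object, Object> byId = identityMap.computeIfAbsent(type, t -> new HashMap<>());
        if (!byId.containsKey(id)) {
            selectCount++;                      // first lookup: simulated SELECT
            byId.put(id, new Order((Long) id)); // the toy "database" only knows Order rows
        }
        return type.cast(byId.get(id));         // later lookups: same instance, no SELECT
    }
}

// Shape of the service method; in Spring this would be a @Transactional bean method.
final class OrderService {
    private final ToyEntityManager em;
    OrderService(ToyEntityManager em) { this.em = em; }

    boolean loadTwice(Long orderId) {
        Order first = em.find(Order.class, orderId);
        Order second = em.find(Order.class, orderId);
        return first == second; // reference equality, not equals()
    }
}

final class IdentityDemo {
    public static void main(String[] args) {
        ToyEntityManager em = new ToyEntityManager();
        boolean same = new OrderService(em).loadTwice(42L);
        // prints: same instance: true, selects: 1
        System.out.println("same instance: " + same + ", selects: " + em.selectCount);
    }
}
```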
Hibernate fires a single SELECT statement for the first find() call. On the second call it checks the identity map, finds an entry for (Order, orderId), and returns the same reference. No SQL is generated. You can confirm this by enabling SQL logging in application.properties:
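For a Spring Boot application, the following properties are one common way to surface the SQL. Treat this as a starting point and adapt it to your own logging setup.

```properties
# Print each SQL statement Hibernate executes to standard output
spring.jpa.show-sql=true
# Alternatively, route the statements through the logging framework
logging.level.org.hibernate.SQL=DEBUG
# Optional: pretty-print the statements
spring.jpa.properties.hibernate.format_sql=true
```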
How It Interacts with Lazy Loading
The first-level cache does not just store root entities. When Hibernate resolves a lazy association the result is also stored. If two different entities both reference the same related entity (for example, two OrderItem rows that share the same Product), Hibernate returns a single shared Product instance rather than creating a duplicate:
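The sharing can be sketched with a toy model in plain Java. ToyContext, OrderItem, and Product are hypothetical stand-ins: the context resolves a product by ID at most once, and getProduct() plays the role of a lazy @ManyToOne accessor. This illustrates the idea only and does not reflect how Hibernate's proxies are implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity shared by several order items.
final class Product {
    final Long id;
    String name;
    Product(Long id, String name) { this.id = id; this.name = name; }
}

// Toy persistence context: at most one Product instance per primary key.
final class ToyContext {
    private final Map<Long, Product> products = new HashMap<>();
    int selectCount = 0; // simulated SELECTs issued for lazy associations

    Product resolveProduct(Long id) {
        Product cached = products.get(id);
        if (cached == null) {
            selectCount++;                      // first access: simulated SELECT
            cached = new Product(id, "Widget");
            products.put(id, cached);
        }
        return cached;                          // later accesses: shared instance
    }
}

// Toy order line whose product is resolved on first access, like a lazy association.
final class OrderItem {
    private final ToyContext context;
    private final Long productId;
    OrderItem(ToyContext context, Long productId) {
        this.context = context;
        this.productId = productId;
    }
    Product getProduct() { return context.resolveProduct(productId); }
}

final class LazySharingDemo {
    public static void main(String[] args) {
        ToyContext ctx = new ToyContext();
        Product p1 = new OrderItem(ctx, 7L).getProduct();
        Product p2 = new OrderItem(ctx, 7L).getProduct();
        p1.name = "Renamed widget";
        System.out.println(p1 == p2); // true: both items see one Product object
        System.out.println(p2.name);  // Renamed widget
    }
}
```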
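This is not merely a performance convenience. It also prevents in-memory lost updates: because p1 and p2 are the same object, a change made through one is immediately visible through the other, and there is never a second copy whose stale state could overwrite it at flush time.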
Bypassing the Cache: When You Need Fresh Data
Because the first-level cache is scoped to the current EntityManager, it is never stale with respect to changes you made to managed entities through that EntityManager. However, another transaction (on another thread or in another process) could modify the same row concurrently, and bulk updates or native SQL can change rows without touching the cached objects. If you need to discard the cached state and reload from the database, use refresh():
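A toy model makes the effect visible. ToyEntityManager below is a hypothetical stand-in that mimics the behaviour of find() and EntityManager.refresh(Object) over an in-memory map acting as the table; with a real EntityManager the call would simply be em.refresh(order).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity with one mutable column.
final class Order {
    final Long id;
    String status;
    Order(Long id, String status) { this.id = id; this.status = status; }
}

// In-memory stand-in for an EntityManager; NOT Hibernate.
final class ToyEntityManager {
    private final Map<Long, String> database;                 // simulated table: id -> status
    private final Map<Long, Order> identityMap = new HashMap<>();

    ToyEntityManager(Map<Long, String> database) { this.database = database; }

    // Loads the row once, then keeps returning the cached instance.
    Order find(Long id) {
        return identityMap.computeIfAbsent(id, key -> new Order(key, database.get(key)));
    }

    // Mimics refresh(): overwrite the in-memory state with the current row.
    void refresh(Order order) {
        order.status = database.get(order.id);
    }
}

final class RefreshDemo {
    public static void main(String[] args) {
        Map<Long, String> db = new HashMap<>();
        db.put(1L, "NEW");
        ToyEntityManager em = new ToyEntityManager(db);

        Order order = em.find(1L);
        db.put(1L, "SHIPPED");                  // simulates a commit by another transaction

        System.out.println(em.find(1L).status); // NEW: the cached instance is returned
        em.refresh(order);
        System.out.println(order.status);       // SHIPPED: state reloaded from the "database"
    }
}
```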
Use refresh() sparingly. It always hits the database and overwrites the entity's in-memory state, discarding any changes you have not flushed. Reserve it for situations where you know an external process has changed a row, for example after calling a stored procedure or a batch job that operates outside of your JPA session.
Evicting Individual Entries
You can also evict a specific entity from the identity map without reloading it, using detach():
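The following sketch models eviction with a hypothetical ToyEntityManager whose detach() and contains() mimic the behaviour of EntityManager.detach(Object) and EntityManager.contains(Object). It is a plain-Java illustration and says nothing about Hibernate's internals.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity.
final class Order {
    final Long id;
    Order(Long id) { this.id = id; }
}

// In-memory stand-in for an EntityManager; NOT Hibernate.
final class ToyEntityManager {
    private final Map<Long, Order> identityMap = new HashMap<>();

    // Returns the managed instance, creating ("loading") it on first access.
    Order find(Long id) {
        return identityMap.computeIfAbsent(id, Order::new);
    }

    // Mimics detach(): drop the entry without reloading anything.
    void detach(Order order) {
        identityMap.remove(order.id);
    }

    // Mimics contains(): true only for the instance currently managed.
    boolean contains(Order order) {
        return identityMap.get(order.id) == order;
    }
}

final class DetachDemo {
    public static void main(String[] args) {
        ToyEntityManager em = new ToyEntityManager();
        Order before = em.find(1L);
        em.detach(before);

        System.out.println(em.contains(before)); // false: no longer managed
        Order after = em.find(1L);               // a fresh instance is loaded
        System.out.println(after == before);     // false: the old reference was evicted
    }
}
```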
After detach(), changes made to the old reference are no longer tracked by Hibernate's dirty-checking mechanism. The entity enters the detached state covered in the previous lesson.
Clearing the Entire Context
For batch processing — importing thousands of rows in a loop — the first-level cache becomes a liability. Every entity you persist accumulates in memory, and Hibernate's flush-time dirty checking iterates all of them. The standard pattern is to flush and clear periodically:
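A sketch of the loop is shown below. The persist(), flush(), and clear() calls have the same names as on a real EntityManager, but ToyEntityManager is a hypothetical in-memory stand-in that only tracks how many entities are held at once. BATCH_SIZE of 50 and the BatchImporter class are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory stand-in for an EntityManager; NOT Hibernate.
final class ToyEntityManager {
    final List<String> database = new ArrayList<>();        // rows actually written
    private final List<String> managed = new ArrayList<>(); // entities held by the context
    private final List<String> pending = new ArrayList<>(); // inserts not yet flushed
    int peakManaged = 0;                                    // largest context size observed

    void persist(String entity) {
        managed.add(entity);
        pending.add(entity);
        peakManaged = Math.max(peakManaged, managed.size());
    }

    // Mimics flush(): write pending inserts to the database.
    void flush() {
        database.addAll(pending);
        pending.clear();
    }

    // Mimics clear(): forget every managed entity; unflushed work is lost.
    void clear() {
        managed.clear();
        pending.clear();
    }
}

final class BatchImporter {
    static final int BATCH_SIZE = 50; // assumed value

    static void importRows(ToyEntityManager em, int rowCount) {
        for (int i = 1; i <= rowCount; i++) {
            em.persist("row-" + i);
            if (i % BATCH_SIZE == 0) {
                em.flush(); // write the batch first
                em.clear(); // then release the entities
            }
        }
        em.flush();         // write the final partial batch
        em.clear();
    }

    public static void main(String[] args) {
        ToyEntityManager em = new ToyEntityManager();
        importRows(em, 1020);
        // prints: written: 1020, peak managed: 50
        System.out.println("written: " + em.database.size() + ", peak managed: " + em.peakManaged);
    }
}
```

With Hibernate, the batch size used in such a loop is often chosen to match the hibernate.jdbc.batch_size setting, so that each flush maps onto whole JDBC batches.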
After clear(), the identity map is empty. Any entity reference you held before the call is now detached. This pattern keeps heap usage bounded by the batch size rather than by the total number of rows you process.
Call flush() and clear() together in batches, never clear() alone. If you clear without flushing first, pending inserts or updates are discarded and never written to the database.
The Cache and JPQL Queries
There is an important subtlety: JPQL (and Criteria API) queries go directly to the database. They do not consult the first-level cache before executing SQL. However, after Hibernate receives the result rows, it resolves each one against the identity map. If a returned row is already in the cache, Hibernate returns the cached instance and ignores the fresh column values from the query result unless you explicitly refresh() the entity. This has two consequences: a change committed by another transaction stays invisible on an entity you already hold, and a change you made in memory but have not flushed is invisible to the query's WHERE clause. The second case can lead to stale reads if you modify an entity and then query for it within the same transaction without flushing first:
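The toy model below illustrates the mismatch. ToyEntityManager is a hypothetical stand-in: findByStatus() plays the role of a JPQL query filtering on status, and the model never auto-flushes, which roughly corresponds to running the query under FlushModeType.COMMIT. It is a sketch of the mechanism, not of Hibernate's query loader.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical entity with one mutable column.
final class Order {
    final Long id;
    String status;
    Order(Long id, String status) { this.id = id; this.status = status; }
}

// In-memory stand-in for an EntityManager; NOT Hibernate, and it never auto-flushes.
final class ToyEntityManager {
    private final Map<Long, String> database = new HashMap<>();   // simulated table: id -> status
    private final Map<Long, Order> identityMap = new HashMap<>();

    ToyEntityManager() { database.put(1L, "NEW"); }

    // Mimics a query filtering on status: the predicate runs against database
    // rows, then each matching row is resolved through the identity map.
    List<Order> findByStatus(String status) {
        List<Order> result = new ArrayList<>();
        for (Map.Entry<Long, String> row : database.entrySet()) {
            if (row.getValue().equals(status)) {
                Order managed = identityMap.get(row.getKey());
                if (managed == null) {
                    managed = new Order(row.getKey(), row.getValue());
                    identityMap.put(row.getKey(), managed);
                }
                result.add(managed); // managed instance wins; row values are not copied over it
            }
        }
        return result;
    }

    // Mimics flush(): write the in-memory state of managed entities to the database.
    void flush() {
        for (Order order : identityMap.values()) {
            database.put(order.id, order.status);
        }
    }
}

final class StaleQueryDemo {
    public static void main(String[] args) {
        ToyEntityManager em = new ToyEntityManager();
        Order order = em.findByStatus("NEW").get(0);
        order.status = "SHIPPED";                                   // in-memory change, not flushed

        System.out.println(em.findByStatus("SHIPPED").isEmpty());   // true: database still says NEW
        System.out.println(em.findByStatus("NEW").get(0).status);   // SHIPPED: cached instance returned
        em.flush();
        System.out.println(em.findByStatus("SHIPPED").size());      // 1: query and memory agree again
    }
}
```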
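With FlushModeType.AUTO (the default), Hibernate flushes before executing a JPQL query whenever the query touches a table that has pending changes in the persistence context, so this kind of stale read typically arises only when auto-flush is not in effect, for example under FlushModeType.COMMIT. The default is usually what you want, but understanding the mechanism helps you reason about query results in complex transactions.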
Summary
The first-level cache is a mandatory, per-EntityManager identity map that eliminates redundant SQL lookups and guarantees object identity within a persistence context. Use refresh() to force a reload from the database, detach() to evict a single entity, and the flush() + clear() pattern to keep batch operations memory-efficient. Awareness of how JPQL queries interact with the cache — bypassing it on the way out but merging results into it on the way back — is essential for avoiding hard-to-diagnose stale-read bugs.