Java OutOfMemoryError: Common Causes and Fixes

When a Java service hits OutOfMemoryError, the fastest mistake is to treat every case like a generic heap problem. Java memory incidents are often less about “memory is full” and more about which memory area is under pressure and why that pressure built up.

The short version: capture the exact OutOfMemoryError variant first. Heap pressure, metaspace growth, direct memory exhaustion, and large queued backlogs do not point to the same fix path.

If you want the wider Java routing view first, step back to the Java Troubleshooting Guide.


Start with the exact error shape

Different OutOfMemoryError messages mean different bottlenecks.

For example:

  • java.lang.OutOfMemoryError: Java heap space
  • java.lang.OutOfMemoryError: GC overhead limit exceeded
  • java.lang.OutOfMemoryError: Metaspace
  • java.lang.OutOfMemoryError: Direct buffer memory

These do not imply the same root cause, and they should not trigger the same response.

That is why the first job is not “increase heap” but “identify which memory area is failing.”


What OOM incidents often look like in production

Before the crash or forced restart, you may see:

  • queue backlog continuing to grow
  • GC activity increasing sharply
  • latency getting worse before the process dies
  • container memory limits reached even when heap sizing looks reasonable
  • deployment or traffic changes exposing old retention assumptions

An OOM is usually the end of a story that started earlier with retention, backlog, class loading, or native pressure.
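The rising GC activity in that list can be observed from inside the process before the crash. A minimal sketch using the standard GarbageCollectorMXBean API (the bean names you will see vary by collector):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcWatch {
    public static void main(String[] args) {
        // Each bean covers one collector (e.g. young-generation vs old-generation).
        // A sharply rising collection count or time is an early warning sign.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d, timeMs=%d%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Polling these counters periodically and alerting on the delta is usually enough to see "GC activity increasing sharply" as a trend rather than a post-mortem surprise.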


Common causes

1. Heap retained by application objects

This is the most familiar pattern.

Large objects or too many long-lived references can slowly fill the heap:

  • large collections
  • caches without clear bounds
  • in-memory queues
  • request payload retention
  • response aggregation buffers

The application may not leak indefinitely, but if objects live much longer than expected, the JVM can still run out of heap under real traffic.
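For the unbounded-cache case, one common fix is to give the cache an explicit bound. A minimal LRU sketch using the standard LinkedHashMap eviction hook (the limit of 1000 entries is an arbitrary placeholder; real bounds depend on entry size and available heap):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// An LRU cache that evicts its least-recently-used entry once it
// exceeds MAX_ENTRIES, so it can never retain unbounded heap.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private static final int MAX_ENTRIES = 1000; // placeholder bound

    BoundedCache() {
        super(16, 0.75f, true); // access-order iteration gives LRU behavior
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > MAX_ENTRIES;
    }
}
```

Production systems often reach for a dedicated cache library instead, but the principle is the same: every cache needs a defined upper bound.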

2. Metaspace growth

Not every OOM is about ordinary objects.

Heavy class loading, dynamic proxies, bytecode generation, or repeated classloader churn can push metaspace much higher than expected.

This is especially relevant in systems with:

  • plugin loading
  • dynamic frameworks
  • repeated redeploy patterns
  • custom classloader usage
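Metaspace usage can be checked from inside the JVM via the standard memory pool beans, which helps confirm this diagnosis before the crash. A small sketch (the pool name "Metaspace" is HotSpot-specific, so this matches by substring):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class MetaspaceCheck {
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // On HotSpot, class metadata lives in a pool named "Metaspace".
            if (pool.getName().contains("Metaspace")) {
                System.out.printf("%s: used=%d bytes%n",
                        pool.getName(), pool.getUsage().getUsed());
            }
        }
    }
}
```

If this number climbs steadily across redeploys or plugin reloads, the problem is class metadata retention, not ordinary object retention.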

3. Direct or native memory pressure

Some incidents happen outside the normal heap story.

Examples include:

  • direct byte buffers
  • JNI allocations
  • off-heap caches
  • process memory overhead in containers

In those cases, heap metrics may look acceptable while the process still fails.
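Direct buffers illustrate the gap: the payload lives outside the Java heap, so heap charts barely move while native memory climbs. A minimal sketch (the 64 MB size is arbitrary; the cap on this memory comes from -XX:MaxDirectMemorySize, not from -Xmx):

```java
import java.nio.ByteBuffer;

public class DirectDemo {
    public static void main(String[] args) {
        // 64 MB allocated outside the Java heap; only a thin wrapper
        // object is visible to heap-centric monitoring tools.
        ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024 * 1024);
        System.out.println("direct=" + buf.isDirect() + " capacity=" + buf.capacity());
    }
}
```

When enough of these accumulate, the process fails with "Direct buffer memory" or is killed by the container runtime, even though the heap itself looked healthy.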

4. Queue backlog and retained work inflate memory

This is often missed.

If thread pools, messaging buffers, or request backlogs keep growing, the queued work itself can retain many objects at once.

That means the root problem may be throughput collapse rather than a classic object leak.

5. Capacity assumptions are wrong

Traffic, payload size, tenant count, or data shape may have outgrown the original JVM sizing and queue design.

Sometimes the code did not change much, but the workload did.


A practical debugging order

1. Capture the exact OutOfMemoryError variant

Do not summarize it as just “OOM.”

The exact message narrows the search space immediately.

2. Identify whether pressure is heap, metaspace, or native

This single distinction prevents many wasted hours.

Heap tuning will not solve direct buffer exhaustion. Metaspace fixes will not solve queue retention.

3. Inspect caches, collections, queues, and payload-heavy paths

Look for the places where the application can retain far more data than expected.

Ask:

  • what grows with traffic?
  • what grows with retries or backlog?
  • what has no clear upper bound?

4. Compare recent traffic and deployment changes

The incident may follow:

  • a new feature path
  • larger request payloads
  • more concurrent work
  • changed cache behavior
  • classloading differences

OOM incidents often make more sense when viewed as a workload shift.

5. Change JVM sizing only after the pressure source is clear

More memory can buy time, but it should not replace diagnosis.

If the pressure source is retention or runaway backlog, a larger heap may only delay the same failure.


Example: queue backlog causing heap pressure

// newFixedThreadPool backs its workers with an unbounded queue
ExecutorService pool = Executors.newFixedThreadPool(8);
for (Task t : tasks) {
    pool.submit(() -> process(t)); // each queued task retains t until it runs
}

If process(t) slows down and incoming work keeps arriving, queued tasks may retain payload objects, references, and closures long enough to turn a throughput incident into a heap incident.

That is why some OOMs are really backlog problems wearing a memory-shaped mask.
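A hedged counter-sketch: the same pool built with an explicit ThreadPoolExecutor and a bounded queue, so backlog applies backpressure to the producer instead of accumulating on the heap (the bounds of 8 threads and 100 queued tasks are placeholders, and the right rejection policy depends on the workload):

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedPool {
    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                8, 8, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(100),             // bounded backlog
                new ThreadPoolExecutor.CallerRunsPolicy()); // overflow slows the submitter
        for (int i = 0; i < 1000; i++) {
            pool.submit(() -> { /* process(t) would go here */ });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

With CallerRunsPolicy, overflow work executes on the submitting thread, which naturally throttles intake; other policies (rejecting, dropping) trade throughput for different failure modes.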


A useful JVM option

java -XX:+HeapDumpOnOutOfMemoryError -jar app.jar

Capturing a heap dump on the first OOM gives you a concrete object graph instead of guessing after the process is gone.

If storage and policy allow it, this is often one of the highest-value safeguards you can enable.
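Heap dumps can also be triggered on demand rather than only at crash time, which is useful for comparing "before" and "after" object graphs. A sketch using the HotSpot-specific diagnostic bean (the dump path is a placeholder; the file must not already exist, and the API assumes a HotSpot JVM):

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class DumpNow {
    public static void main(String[] args) throws Exception {
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        // true = dump only live objects (forces a GC first)
        bean.dumpHeap("/tmp/app.hprof", true); // placeholder path
    }
}
```

The same dump can be produced externally with jcmd or jmap if you prefer not to add code; either way, a dump taken while the backlog is growing is far more informative than one taken after a restart.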


What to change after you find the pressure source

If heap retention is the issue

Reduce retention, bound caches and queues, and remove long-lived references.

If metaspace is the issue

Inspect classloader behavior, dynamic code generation, and redeploy patterns.

If native or direct memory is the issue

Trace off-heap usage and container memory assumptions instead of focusing only on heap charts.

If backlog is the issue

Treat queue growth and throughput collapse as the primary incident, not a secondary detail.

If the workload simply outgrew sizing

Resize intentionally, but only after you understand the path consuming memory.


A useful incident question

Ask this:

Which memory area actually failed, and what was growing there before the JVM died?

That question is much more actionable than “Should we raise heap?”


FAQ

Q. Should I just increase heap size first?

Not before you know which memory area is actually failing.

Q. Can thread pools cause memory pressure too?

Yes. Large backlogs and queued work can retain many objects at once.

Q. What is the fastest first step?

Capture the exact error variant and map it to the affected memory area.

Q. Is every OOM a memory leak?

No. Some are classic leaks, but others come from backlog, larger workloads, or configuration that no longer matches reality.

