Java Thread Deadlock: Common Causes and Fixes


When a Java service stops making progress, the problem may be a real thread deadlock, heavy lock contention that looks similar, or a queue stalled behind a small number of blocked workers. Those situations can feel identical from the outside, but they do not have the same fix.

The short version: separate true deadlock from generic waiting before you restart the process. A real deadlock means at least two execution paths are waiting on each other in a cycle. High contention and worker starvation may look frozen too, but the thread states tell a different story.

If you want a broader overview of Java troubleshooting first, step back to the Java Troubleshooting Guide.


Start with thread state and lock ownership

Deadlock is fundamentally about a wait cycle.

That makes these artifacts more useful than request latency alone:

  • thread dumps
  • monitor ownership
  • lock ordering
  • repeated blocked/waiting relationships

If you do not capture that state before restart, you often lose the only clear proof of what happened.
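
One way to capture that state from inside the process is the standard java.lang.management API. The following is a minimal sketch, not a complete tool: the class name DeadlockProbe and the report format are assumptions made for illustration, while ThreadMXBean.findDeadlockedThreads() and ThreadInfo are part of the JDK.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: asks the JVM which threads it considers deadlocked.
final class DeadlockProbe {

    /** Returns one line per deadlocked thread, or an empty string if none are found. */
    static String report() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no cycle is detected
        if (ids == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        // true, true: include locked monitors and locked ownable synchronizers
        for (ThreadInfo info : mx.getThreadInfo(ids, true, true)) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            sb.append(info.getThreadName())
              .append(" waits for ").append(info.getLockName())
              .append(" held by ").append(info.getLockOwnerName())
              .append('\n');
        }
        return sb.toString();
    }
}
```

If changing the code is not an option, external tools such as jstack or jcmd with Thread.print produce thread dumps with comparable lock ownership details.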


What deadlock usually looks like

In production, a suspected deadlock often appears as:

  • requests hanging indefinitely
  • worker threads staying blocked for the same long interval
  • queues growing behind a small set of stuck threads
  • very low throughput even when CPU is not especially high
  • repeated thread dumps showing the same wait relationships

If the same threads keep waiting on the same locks across multiple dumps, a true cycle becomes much more plausible.


Common causes

1. Lock ordering is inconsistent

This is the classic cause.

One code path acquires locks in one order, while another path acquires the same locks in the opposite order.

synchronized (a) {
    synchronized (b) {
        // work
    }
}

If another path takes b and then a, deadlock only needs the right timing to appear.

2. Multiple locks are held across broad critical sections

Even if lock ordering is mostly safe, large synchronized scopes raise the risk: the longer a thread holds one lock, the wider the window in which another thread can take a second lock in a conflicting order.

This becomes more dangerous when:

  • several locks are nested
  • critical sections do more than state mutation
  • shared objects are touched by many request paths

3. Blocking work happens inside synchronized sections

I/O or slow work inside a lock does not automatically create a true deadlock, but it can make contention and waiting cascades much worse.

It also makes the incident harder to interpret because many threads pile up behind one stalled path.

4. Worker starvation hides the real issue

Queues may keep growing because only a few threads are stuck while the rest wait behind them.

In that case, operators may diagnose “deadlock” when the real issue is:

  • too few free workers
  • same-pool nested waits
  • one stuck dependency path blocking everyone else
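
The "same-pool nested waits" case can be sketched as follows. This is an illustrative example only: the class NestedWaitDemo, the pool size of 1, and the 200 ms timeout are assumptions chosen to make the effect visible. The outer task occupies the only worker and then waits for an inner task that can never start.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo of starvation caused by a task waiting on its own pool.
final class NestedWaitDemo {

    /** Returns true if the inner task timed out because no worker was free to run it. */
    static boolean innerTaskStarves() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Future<Boolean> outer = pool.submit(() -> {
                // The only worker is busy running this task, so "inner" stays queued.
                Future<String> inner = pool.submit(() -> "done");
                try {
                    inner.get(200, TimeUnit.MILLISECONDS);
                    return false;
                } catch (TimeoutException e) {
                    return true;
                }
            });
            return outer.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Without the timeout, the outer task would wait forever. A thread dump would show the worker waiting on a Future rather than on a monitor held by another thread, so there is no lock cycle to find even though the service looks frozen.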

5. It is heavy contention, not deadlock

This is a very common false alarm.

High contention can make a service look frozen even when there is no strict cyclic wait. Progress is still possible, just painfully slow.


A practical debugging order

1. Capture thread dumps from the incident window

Take more than one if possible.

You want to know whether the same threads remain:

  • BLOCKED
  • WAITING
  • TIMED_WAITING

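around the same locks and owners across time. A thread stuck entering a synchronized block shows as BLOCKED, a thread parked on a java.util.concurrent lock shows as WAITING, and a TIMED_WAITING thread will wake on its own, so it is rarely part of a true deadlock.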
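
Comparing dumps over time can also be sketched in code. The example below is a rough illustration under stated assumptions: the class DumpDiff and its two methods are hypothetical names, and the "state on lock" string is an ad hoc format. It records which threads are not running and keeps only those that look identical in two snapshots.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: finds threads that wait on the same lock in two snapshots.
final class DumpDiff {

    /** Maps thread id to "name state on lock" for every thread that is not runnable. */
    static Map<Long, String> snapshot() {
        Map<Long, String> out = new HashMap<>();
        for (ThreadInfo info : ManagementFactory.getThreadMXBean().dumpAllThreads(true, true)) {
            if (info == null) {
                continue;
            }
            Thread.State s = info.getThreadState();
            if (s == Thread.State.BLOCKED || s == Thread.State.WAITING
                    || s == Thread.State.TIMED_WAITING) {
                out.put(info.getThreadId(),
                        info.getThreadName() + " " + s + " on " + info.getLockName());
            }
        }
        return out;
    }

    /** Keeps only the entries that are identical in both snapshots. */
    static Map<Long, String> stuckInBoth(Map<Long, String> first, Map<Long, String> second) {
        Map<Long, String> same = new HashMap<>(first);
        same.entrySet().retainAll(second.entrySet());
        return same;
    }
}
```

Idle pool workers will also appear in both snapshots, because they legitimately wait on the same queue. The result is a list of candidates to inspect, not proof of a deadlock.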

2. Identify owner, waiting, and blocked relationships

For each suspicious lock, find:

  • which thread owns it
  • which thread is waiting for it
  • whether the owner is itself waiting on another lock

This is how a cycle becomes visible.
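
The owner-chain walk can be expressed as a small graph search. This is a sketch under assumptions: WaitGraph, fromJvm, and findCycle are hypothetical names, and the graph has one edge per waiting thread, pointing at the thread that owns the lock it wants. ThreadInfo.getLockOwnerId() is the JDK call that supplies those edges and returns -1 when there is no owner.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: models "thread waits for lock owned by thread" edges.
final class WaitGraph {

    /** Builds thread id -> owner thread id for every thread waiting on an owned lock. */
    static Map<Long, Long> fromJvm() {
        Map<Long, Long> edges = new HashMap<>();
        for (ThreadInfo info : ManagementFactory.getThreadMXBean().dumpAllThreads(true, true)) {
            if (info != null && info.getLockOwnerId() != -1) {
                edges.put(info.getThreadId(), info.getLockOwnerId());
            }
        }
        return edges;
    }

    /** Follows owner links from start; returns the cycle, or an empty list if the chain ends. */
    static List<Long> findCycle(Map<Long, Long> edges, long start) {
        List<Long> path = new ArrayList<>();
        Long current = start;
        while (current != null) {
            int seen = path.indexOf(current);
            if (seen >= 0) {
                return path.subList(seen, path.size()); // revisited a thread: cycle found
            }
            path.add(current);
            current = edges.get(current); // null when this thread is not waiting on anyone
        }
        return List.of();
    }
}
```

A chain that ends at a thread which is running or waiting on I/O is contention behind one slow owner. A chain that loops back on itself is a deadlock.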

3. Check lock order across code paths

Search for the synchronized or lock-taking paths involved and compare their acquisition order.

If the same pair of locks is taken in different order in different code paths, the root cause becomes much clearer.

4. Compare blocked threads with queue growth and worker starvation

If backlog is growing but only a few threads are truly stuck, you may be looking at downstream blocking or executor starvation rather than a lock cycle.

5. Only restart after you preserve enough evidence

Restarting may restore availability, but it also destroys the state you need to fix the actual problem.

If the service must be restarted, capture as much thread and lock state as you can first.


Example: opposite lock ordering

// path 1
synchronized (a) {
    synchronized (b) {
        update();
    }
}

// path 2
synchronized (b) {
    synchronized (a) {
        update();
    }
}

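This code may run for a long time without incident. Then one day, under load, one thread takes a on path 1 at the same moment another thread takes b on path 2. Each now waits for the lock the other holds, and neither can ever proceed.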

That is why “it worked in tests” does not rule out lock-order bugs.


What to change after you confirm the issue

Enforce one lock order everywhere

This is the most direct fix for classic deadlock.
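
A minimal sketch of one way to enforce a single order: give every lockable object a stable, unique key and always lock the smaller key first. The Resource class, its id field, and the transfer method are assumptions invented for this example, and the approach relies on the ids being unique.

```java
// Hypothetical example: both directions of a transfer lock in the same order.
final class OrderedLocks {

    static final class Resource {
        final long id; // assumed unique and never changed
        int value;

        Resource(long id, int value) {
            this.id = id;
            this.value = value;
        }
    }

    /** Moves amount between resources, always locking the lower id first. */
    static void transfer(Resource from, Resource to, int amount) {
        Resource first = from.id < to.id ? from : to;
        Resource second = first == from ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.value -= amount;
                to.value += amount;
            }
        }
    }
}
```

Because transfer(a, b, n) and transfer(b, a, n) now acquire the two monitors in the same order, the opposite-order interleaving can no longer occur for this pair of locks.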

Reduce nested locking

If too many paths hold multiple locks at once, simplify ownership and critical sections.

Move slow work outside synchronized sections

Even when it is not the root deadlock, slow work inside locks magnifies incidents.
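
As an illustration, the sketch below performs the slow call before taking the lock and holds the lock only for the map update. PriceCache and slowLookup are hypothetical names. The trade-off is that two threads may fetch the same key concurrently and the last write wins, which this example assumes is acceptable.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache: slow work happens outside the lock, publishing happens inside it.
final class PriceCache {
    private final Object lock = new Object();
    private final Map<String, Integer> prices = new HashMap<>();
    private final Function<String, Integer> slowLookup; // stands in for I/O or a remote call

    PriceCache(Function<String, Integer> slowLookup) {
        this.slowLookup = slowLookup;
    }

    void refresh(String key) {
        Integer fresh = slowLookup.apply(key); // slow part: no lock held
        synchronized (lock) {
            prices.put(key, fresh);            // short critical section: state mutation only
        }
    }

    Integer get(String key) {
        synchronized (lock) {
            return prices.get(key);
        }
    }
}
```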

Separate deadlock from starvation patterns

If the real issue is pool starvation or queue buildup, fix that path instead of focusing only on monitor cycles.

Add better diagnostic hooks

Thread dumps, blocked thread metrics, and lock-related incident playbooks reduce guesswork the next time this happens.
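
A blocked thread metric can be as simple as counting threads in the BLOCKED state. This is a minimal sketch, assuming the count is exported to whatever metrics system is already in place; the class BlockedThreadGauge is a hypothetical name, while ThreadMXBean.getAllThreadIds() and getThreadInfo(long[]) are JDK calls.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical gauge: number of threads currently waiting to enter a monitor.
final class BlockedThreadGauge {

    static long countBlocked() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long blocked = 0;
        // getThreadInfo(long[]) omits stack traces, which keeps the call cheap
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                blocked++;
            }
        }
        return blocked;
    }
}
```

A value that stays above zero across several samples is a cue to take full thread dumps. On its own it cannot distinguish deadlock from contention.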


A useful incident question

Ask this:

Are two or more threads waiting on each other in a stable cycle, or are many threads simply piling up behind one slow or contended path?

That distinction changes the fix completely.


FAQ

Q. Is every blocked thread a deadlock?

No. Many incidents are contention or starvation rather than a true cyclic wait.

Q. What is the fastest first step?

Take thread dumps and look for repeated waiting cycles around the same locks.

Q. Should I restart immediately?

Only after you capture enough state to understand whether the problem is deadlock, contention, or backlog.

Q. Can CPU still be high during deadlock investigations?

Yes. Some threads may still spin, retry, or process backlog while the truly deadlocked threads remain stuck.

