When Java CPU usage goes high, the easiest mistake is to treat it as one generic scaling problem. In reality, high CPU can come from very different sources: real application work, garbage collection pressure, retry loops, lock contention, or threads that keep waking up without making progress.
The short version: start with hot threads, then compare them with GC activity from the same incident window. Host-level CPU tells you that pressure exists. Hot thread stacks tell you whether the CPU is being spent on useful work, memory cleanup, retries, or contention.
If you want the wider Java routing view first, step back to the Java Troubleshooting Guide.
Start with hot threads, not only host CPU
Machine-level CPU charts are helpful, but they do not tell you where the pressure is coming from.
The first practical split is:
- CPU spent in application code
- CPU spent in garbage collection
- CPU wasted in retries, spins, or coordination
That distinction matters because each path leads to a different fix.
What high CPU usually looks like in production
This symptom often appears alongside:
- request latency spikes
- queue growth or backlog
- high allocation rates
- thread contention or blocked-worker side effects
- burst traffic that never fully settles back down
Sometimes the service is simply busy with legitimate work. But just as often, the CPU rise is a side effect of wasted work or a bottleneck somewhere else.
Common causes
1. Busy request paths or tight loops
Application code may be doing far more work than expected after:
- traffic growth
- payload size changes
- a new feature rollout
- accidental quadratic behavior
If hot stacks point to request parsing, serialization, filtering, sorting, or repeated transformations, the CPU rise may be real useful work that grew beyond assumptions.
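Accidental quadratic behavior is worth a concrete illustration. A common shape is a membership check inside a loop, where `List.contains` turns a linear pass into O(n²). The sketch below is illustrative (the class and method names are not from the original text):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DedupExample {
    // Quadratic: List.contains scans the whole output list on every add,
    // so total work grows as n^2 once inputs get large.
    static List<String> dedupQuadratic(List<String> input) {
        List<String> out = new ArrayList<>();
        for (String s : input) {
            if (!out.contains(s)) {  // O(n) scan inside an O(n) loop
                out.add(s);
            }
        }
        return out;
    }

    // Linear: a HashSet makes the membership check roughly O(1),
    // so the same logic stays cheap as traffic or payloads grow.
    static List<String> dedupLinear(List<String> input) {
        Set<String> seen = new HashSet<>();
        List<String> out = new ArrayList<>();
        for (String s : input) {
            if (seen.add(s)) {
                out.add(s);
            }
        }
        return out;
    }
}
```

Both versions return the same result; only the growth curve differs, which is exactly the kind of change a payload-size increase exposes.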
2. GC overhead shows up as CPU
Allocation churn can turn memory pressure into CPU pressure.
This often happens when:
- large temporary objects are created rapidly
- request fan-out creates many short-lived allocations
- caches or queues retain more than expected
- heap pressure forces frequent collections
In these cases, the application may look CPU-bound even though the deeper driver is memory behavior.
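One classic source of allocation churn is string concatenation in a loop, where every `+=` allocates a fresh String and copies the old one. A minimal sketch of the churny pattern and the cheaper alternative (names are illustrative):

```java
public class ChurnExample {
    // Churns: each += allocates a new String and copies the previous one,
    // producing short-lived garbage that grows roughly quadratically
    // with the number of parts.
    static String joinChurny(String[] parts) {
        String out = "";
        for (String p : parts) {
            out += p;
        }
        return out;
    }

    // One growing buffer instead of many throwaway Strings,
    // so the GC has far fewer temporary objects to clean up.
    static String joinCheap(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }
}
```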
3. Lock contention, retries, or spin loops waste CPU
Not all CPU is productive.
Threads may repeatedly:
- wake up and retry
- poll shared state
- contend for the same lock
- spin on availability checks
That can produce high CPU without proportional throughput.
4. Backlog pressure moves CPU to the wrong layer
If queues grow, workers saturate, or connection waits cascade, the system may spend more CPU on:
- scheduling
- retries
- timeouts
- queue management
The visible CPU spike then hides the real bottleneck.
5. Too many threads are fighting over shared state
Increasing thread count can sometimes worsen CPU behavior.
More threads may mean:
- more context switching
- more monitor contention
- more failed acquisition attempts
- more GC pressure from queued or duplicated work
This is why thread count is rarely the first setting to change.
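When a pool does need tuning, a bounded configuration at least makes saturation visible instead of silently adding contention. A sketch of one reasonable setup, not a recommended production config:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolExample {
    // A fixed-size pool with a bounded queue: when the system falls
    // behind, submissions are rejected (visible backpressure) rather
    // than piling up more threads to fight over CPU and shared state.
    static ThreadPoolExecutor boundedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
    }
}
```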
A practical debugging order
1. Capture hot threads during the incident
Start with the threads actually consuming CPU.
Useful commands include:
top -H -p <pid>
jcmd <pid> Thread.print
Matching a hot OS thread to a Java stack is often the fastest way to move from “CPU is high” to an actual culprit.
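The matching step has one wrinkle: top -H prints the OS thread id in decimal, while a HotSpot thread dump prints the same id in hexadecimal in its nid=0x... field. A tiny conversion sketch (class name is illustrative):

```java
public class TidToNid {
    // Converts a decimal thread id from `top -H` into the hex
    // "nid=0x..." form used in HotSpot thread dumps, so you can
    // grep the dump for the hot thread's stack.
    static String toNid(long tid) {
        return "nid=0x" + Long.toHexString(tid);
    }
}
```

For example, a thread shown as 21507 in top appears as nid=0x5403 in the dump.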
2. Compare CPU spikes with GC activity
Look at:
- GC frequency
- GC pause timing
- allocation rate
- old generation pressure
If CPU spikes line up with GC churn, heap behavior is likely part of the story.
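If GC logs are not enabled, the JVM's own management beans give a quick in-process view of collector activity. A minimal sketch: sample this twice and diff the values to see how much GC ran during the incident window:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcSnapshot {
    // Sums collection counts and accumulated collection time (ms)
    // across all collectors. Beans that do not track a value
    // return -1, so clamp to zero before summing.
    static long[] snapshot() {
        long count = 0, timeMs = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            count += Math.max(0, gc.getCollectionCount());
            timeMs += Math.max(0, gc.getCollectionTime());
        }
        return new long[] { count, timeMs };
    }
}
```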
3. Check for retries, polling, and contention
Ask:
- are there loops that keep checking state?
- are timeouts triggering retry storms?
- are many threads contending for one monitor?
If throughput does not rise with CPU, wasted work should move up the suspect list.
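Retry storms in particular get cheaper with capped exponential backoff, which spaces attempts out instead of letting them hammer the CPU. A minimal sketch of the delay schedule (names and defaults are illustrative):

```java
public class Backoff {
    // Capped exponential backoff: attempt 0 -> baseMs, 1 -> 2x,
    // 2 -> 4x, ... but never more than maxMs. The shift is clamped
    // so large attempt numbers cannot overflow the long.
    static long delayMs(int attempt, long baseMs, long maxMs) {
        long d = baseMs << Math.min(attempt, 20);
        return Math.min(d, maxMs);
    }
}
```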
4. Compare queue growth and latency with CPU rise
If queue depth and latency rise before CPU does, then high CPU may be the effect of overload rather than the original cause.
This is especially common when:
- executors are saturated
- callers retry aggressively
- downstream systems slow down
5. Only tune threads or heap after the source is clear
If the problem is real application work, scaling may help.
If the problem is wasted work or memory churn, scaling alone may just make the incident more expensive.
Example: hot loop hidden as “high CPU”
// ready is an AtomicBoolean shared with a producer thread
while (!ready.get()) {
    // busy-wait: keeps a core spinning until the flag flips
}
This code may look harmless in a small test, but under production load it can keep cores busy without doing useful work.
A better pattern usually involves:
- waiting on a proper signal
- backing off instead of spinning
- reducing needless polling
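The "waiting on a proper signal" option can be as simple as a CountDownLatch: the waiting thread parks and uses no CPU until the flag flips. A minimal sketch (class and method names are illustrative):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class SignalExample {
    private final CountDownLatch ready = new CountDownLatch(1);

    // Producer side: signal readiness exactly once.
    void markReady() {
        ready.countDown();
    }

    // Consumer side: the thread blocks instead of spinning, so it
    // burns no CPU until the signal arrives or the timeout fires.
    boolean awaitReady(long timeoutMs) {
        try {
            return ready.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```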
What to change after you find the pattern
If hot stacks point to real work
Optimize the expensive path or scale the service intentionally.
If hot stacks point to GC
Reduce allocation churn, inspect retention, and follow the memory path before changing random heap flags.
If hot stacks point to retries or spins
Reduce waste first. Backoff, deduplicate, or redesign the coordination path.
If hot stacks point to contention
Shorten critical sections and revisit shared-state design before adding threads.
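Shortening a critical section often just means moving work that touches no shared state outside the lock. A sketch of the before-and-after shape (class and method names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class CriticalSection {
    private final List<String> log = new ArrayList<>();

    // Wide: the formatting happens while holding the monitor, so
    // every caller serializes on work that needs no shared state.
    synchronized void recordSlow(String user, long ms) {
        log.add(String.format("user=%s latency=%dms", user, ms));
    }

    // Narrow: format outside the lock, hold the monitor only for
    // the shared mutation itself.
    void recordFast(String user, long ms) {
        String line = String.format("user=%s latency=%dms", user, ms);
        synchronized (this) {
            log.add(line);
        }
    }

    synchronized int size() {
        return log.size();
    }
}
```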
If CPU rises behind queue growth
Treat the queue and backpressure problem first, because CPU may be downstream of that incident.
A useful incident question
Ask this:
Is the CPU being spent on useful application work, memory cleanup, or work that should not exist at all?
That question is much more actionable than “Why is CPU high?”
FAQ
Q. Does high CPU always mean not enough threads?
No. Extra threads can make contention, scheduling overhead, and GC pressure worse.
Q. What is the fastest first step?
Capture hot threads and compare them with GC activity from the same incident window.
Q. Should I scale the service first?
Only after you know whether the CPU rise comes from real work, wasted work, or memory pressure.
Q. Can queue backlog cause CPU spikes too?
Yes. Retries, scheduling churn, and coordination overhead can all rise when the system falls behind.
Read Next
- If GC overhead appears to be driving the spike, continue with Java GC Pauses Too Long.
- If the service is stalling rather than only running hot, compare with Java Thread Deadlock.
- If contention shows up in hot stacks, check Java Thread Contention High.
- If queued work also keeps rising, compare with Java Thread Pool Queue Keeps Growing.
- If you need the wider symptom map again, return to the Java Troubleshooting Guide.