Redis latency spikes become hard to debug when teams treat every slowdown as the same class of incident. In one case the server is busy running expensive commands. In another, Redis is fast but the host, hypervisor, or network path is already noisy. In another, persistence or memory pressure adds pauses that only show up during certain windows.
The short version: split the incident into command cost, environment baseline, network path, and persistence or memory side effects before changing Redis settings. That branching step is what keeps a latency investigation from turning into random tuning.
Start by separating four different latency buckets
Redis latency usually comes from one or more of these buckets:
- expensive Redis commands
- intrinsic environment or host latency
- network and client round-trip delay
- persistence, memory pressure, or swapping side effects
If you do not separate those buckets early, almost every attempted fix becomes guesswork. A command tuning change will not solve noisy virtualization, and a networking change will not fix a blocking Lua script or a big-key delete.
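The branching step above can be sketched as a tiny triage helper. This is a hypothetical illustration, not a Redis API: the signal names and bucket mapping are assumptions you would replace with whatever your monitoring actually emits.

```python
# Hypothetical triage helper: map observed signal names to the four
# latency buckets above so each bucket gets its own investigation path.
def classify_latency_signals(signals):
    """Return the buckets suggested by a set of observed signal names."""
    buckets = {
        "command_cost": {"slowlog_entries", "big_key_access", "lua_script"},
        "environment_baseline": {"host_jitter", "noisy_neighbor"},
        "network_path": {"high_rtt", "many_round_trips"},
        "persistence_or_memory": {"fork_spike", "aof_rewrite", "swapping"},
    }
    hits = {name for name, markers in buckets.items() if markers & set(signals)}
    return sorted(hits) or ["unclassified"]

# A mixed incident maps to more than one bucket:
print(classify_latency_signals({"slowlog_entries", "fork_spike"}))
```

The point is not the code itself but the shape of the decision: one incident can legitimately land in two buckets at once, and each bucket gets a different fix.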
Slow commands are only one part of the story
Slow commands are a common cause, so SLOWLOG is still one of the first tools to inspect.
But a clean slowlog does not mean Redis is healthy. Latency spikes can still come from:
- fork or rewrite activity during persistence
- host-level jitter or virtualization overhead
- memory pressure and swapping
- too many client round trips
That is why a good incident review has to branch instead of assuming every spike is a command problem.
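One way to make that branching concrete is to check whether the incident window actually contains slowlog hits. SLOWLOG GET entries start with (id, unix timestamp, duration in microseconds, command args); the sketch below filters a sample reply by window, with invented timestamps for illustration.

```python
# SLOWLOG GET entries carry (id, unix_ts, duration_us, args, ...).
# Minimal sketch: does the incident window contain any slowlog hits?
# An empty result is the cue to branch to the other causes listed above.
def slowlog_hits_in_window(entries, start_ts, end_ts):
    """Return slowlog entries whose timestamp falls inside the window."""
    return [e for e in entries if start_ts <= e[1] <= end_ts]

entries = [
    (12, 1719830100, 85000, ["HGETALL", "user:hot"]),       # 85 ms command
    (11, 1719820000, 42000, ["LRANGE", "feed:1", "0", "-1"]),
]
print(slowlog_hits_in_window(entries, 1719830000, 1719830200))
```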
Use latency monitoring when you need event-level clues
Redis exposes latency monitoring for cases where the system feels slow but the application logs do not clearly tell you why.
A practical start is:
redis-cli CONFIG SET latency-monitor-threshold 100
redis-cli LATENCY LATEST
redis-cli LATENCY HISTORY command
redis-cli LATENCY DOCTOR
Note that "command" in LATENCY HISTORY is a real event name reported by LATENCY LATEST, not a placeholder for your own command.
This helps when spikes are real but not obviously tied to one query path. If a team says “the cache feels random” or “p95 jumps for a few minutes,” latency monitor often gives you the first useful time-aligned clue.
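To get that time alignment, it helps to turn the raw LATENCY LATEST reply into timestamps you can line up against application logs. Each row of the reply is (event name, unix timestamp of the latest event, latest latency in ms, all-time max in ms); the sample values below are invented.

```python
# LATENCY LATEST rows are (event, last_unix_ts, last_ms, max_ms).
# Sketch: turn a reply (hard-coded sample here) into time-aligned clues.
from datetime import datetime, timezone

sample_reply = [
    ("command", 1719830400, 210, 540),
    ("fork", 1719830350, 130, 130),
]

def summarize_latency_latest(rows):
    out = []
    for event, ts, last_ms, max_ms in rows:
        when = datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
        out.append(f"{event}: last {last_ms} ms (max {max_ms} ms) at {when}")
    return out

for line in summarize_latency_latest(sample_reply):
    print(line)
```

Lining those ISO timestamps up against a p95 graph is usually the fastest way to see whether the spike window matches a fork, a rewrite, or ordinary command execution.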
Do not ignore intrinsic latency and networking
Redis documentation emphasizes that your operating system, hypervisor, and network create a baseline you cannot beat.
Ask:
- is Redis running in a noisy virtualized environment?
- is the client far from the Redis node?
- are too many sequential round trips happening?
- would pipelining reduce visible delay?
Sometimes Redis is not the bottleneck. The surrounding path is. That distinction matters because teams often increase Redis resources when the real problem is chatty client behavior or a poor runtime environment.
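The pipelining question can be answered with back-of-envelope arithmetic before touching any code: N sequential commands pay the round trip N times, while a pipeline pays it roughly once. The RTT and per-command costs below are assumptions for illustration.

```python
# Back-of-envelope model of client-visible latency.
# Sequential: every command pays the full round trip.
# Pipelined: one round trip plus the server-side cost of each command.
def client_visible_ms(n_commands, rtt_ms, per_command_ms, pipelined):
    if pipelined:
        return rtt_ms + n_commands * per_command_ms
    return n_commands * (rtt_ms + per_command_ms)

# 200 GETs at 1 ms RTT and 0.05 ms server-side cost each:
print(client_visible_ms(200, 1.0, 0.05, pipelined=False))  # ~210 ms
print(client_visible_ms(200, 1.0, 0.05, pipelined=True))   # ~11 ms
```

When the sequential number dwarfs the pipelined one, the "Redis is slow" report is really a chatty-client report, and no amount of server tuning will close that gap.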
Persistence, swapping, and memory pressure can create ugly spikes
Redis latency often gets worse when:
- memory is tight
- the kernel swaps
- persistence fork and rewrite work overlap with traffic
- disk behavior becomes slow or unstable
If spikes line up with save or rewrite windows, compare the incident with Redis Persistence Latency. If memory is also rising, compare it with Redis Memory Usage High.
These are classic cases where the visible symptom is “Redis latency,” but the operational cause is broader system work happening around Redis.
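Checking whether spikes line up with save or rewrite windows is a simple interval-overlap question. A minimal sketch, assuming you can pull spike timestamps from monitoring and persistence windows from your own records (the numbers here are invented):

```python
# Sketch: which latency spikes fall inside known persistence windows
# (e.g. BGSAVE or AOF rewrite intervals pulled from your monitoring)?
def spikes_in_windows(spike_ts, windows):
    """Return the spike timestamps that fall inside any (start, end) window."""
    return [t for t in spike_ts if any(start <= t <= end for start, end in windows)]

spikes = [100, 260, 400]        # assumed spike timestamps
save_windows = [(250, 300)]     # assumed BGSAVE window
print(spikes_in_windows(spikes, save_windows))  # only the 260 spike overlaps
```

If most spikes land inside the windows, the investigation belongs in the persistence bucket, not the command bucket.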
Big keys often turn normal commands into spike generators
One oversized key can make ordinary reads, writes, deletes, expirations, and rewrites far more expensive than expected.
That is why a team may think “Redis is randomly spiking” when the real story is:
- one key family became too large
- one feature touched that family in a burst
- normal commands suddenly became expensive
If a spike clusters around one feature or one key family, Redis Big Keys is often the next best guide.
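A quick way to test the big-key hypothesis is to compare key sizes within one family. The sketch below assumes you already sampled per-key sizes (for example with MEMORY USAGE during a SCAN) and flags outliers against the family median; the sizes are invented.

```python
# Sketch, assuming per-key sizes sampled elsewhere (e.g. MEMORY USAGE
# during a SCAN). Flag keys whose size dwarfs the median of the family.
from statistics import median

def flag_big_keys(key_sizes, ratio=10):
    """key_sizes: {key: approx_bytes}. Flag keys > ratio x the median size."""
    med = median(key_sizes.values())
    return sorted(k for k, size in key_sizes.items() if size > ratio * med)

sizes = {"user:1": 900, "user:2": 1_100, "user:hot": 250_000}
print(flag_big_keys(sizes))  # the outlier key stands out immediately
```

A median-relative threshold works better here than an absolute one, because "too big" depends on what the rest of the family looks like.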
A practical debugging order
Use this order during an incident:
- inspect SLOWLOG
- inspect latency monitor events
- compare spikes against persistence and save windows
- check whether memory pressure or swapping is involved
- test whether host or network baseline is already too high
This sequence usually gets you closer to the real cause than changing timeouts or Redis config blindly.
A quick command set for the first 10 minutes
redis-cli SLOWLOG GET 10
redis-cli LATENCY LATEST
redis-cli INFO memory
redis-cli INFO persistence
redis-cli INFO stats
Read those outputs together instead of one by one. A slowlog-heavy incident points toward command cost. Clean slowlog plus persistence activity points somewhere else. Rising memory pressure plus latency events often means the spike is part of a broader resource issue.
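That cross-reading can be partly automated. INFO replies are field:value lines; the sketch below parses them and applies the two heuristics just described. The sample values are invented, and the 90% threshold is an assumption, not a Redis default.

```python
# Sketch: parse the field:value lines of an INFO reply and apply the
# cross-reading heuristics above. Sample values are invented.
def parse_info(text):
    """Parse INFO output into a flat dict, skipping section headers."""
    info = {}
    for line in text.splitlines():
        if ":" in line and not line.startswith("#"):
            key, _, value = line.partition(":")
            info[key] = value.strip()
    return info

def info_clues(info):
    """Return human-readable clues suggested by a parsed INFO dict."""
    clues = []
    if info.get("rdb_bgsave_in_progress") == "1" or \
       info.get("aof_rewrite_in_progress") == "1":
        clues.append("persistence work overlapping traffic")
    used = int(info.get("used_memory", 0))
    cap = int(info.get("maxmemory", 0) or 0)
    if cap and used > 0.9 * cap:   # assumed 90% pressure threshold
        clues.append("memory pressure")
    return clues

sample = """# Persistence
rdb_bgsave_in_progress:1
aof_rewrite_in_progress:0
# Memory
used_memory:950000000
maxmemory:1000000000"""

print(info_clues(parse_info(sample)))
```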
What teams often miss
Latency spikes are often mixed incidents.
For example:
- one command family becomes slower
- the host baseline is already noisy
- persistence windows make the peak worse
In those cases, looking for only one root cause makes the incident feel more mysterious than it really is. Redis can be both a victim and a contributor in the same outage window.
A practical question to keep asking
During a spike, do not ask only “what command was slow?” Ask “which layer became slow first?”
That framing helps separate:
- a Redis execution problem
- an environment baseline problem
- a persistence or memory-side effect problem
That is often the difference between fixing the actual bottleneck and only tuning the most visible symptom.
FAQ
Q. Are Redis latency spikes always caused by Redis commands?
No. They can also come from networking, the operating system, swapping, or persistence side effects.
Q. What is the fastest first step?
Inspect SLOWLOG, then compare it with latency monitor events from the same incident window.
Q. When should I suspect big keys?
When spikes cluster around one feature, one key family, or one command path touching unusually large data.
Q. If slowlog is clean, can Redis still feel slow?
Yes. Persistence, host latency, and memory pressure can all create visible spikes.
Read Next
- If you want the command-level path, continue with Redis Slowlog Guide.
- If you suspect data shape is the real cause, continue with Redis Big Keys.
- If the timing matches save or rewrite activity, continue with Redis Persistence Latency.
Sources:
- https://redis.io/docs/latest/operate/oss_and_stack/management/optimization/latency/
- https://redis.io/docs/latest/operate/oss_and_stack/management/optimization/latency-monitor/