Recently, I encountered a case where a Redis instance experienced a sudden memory surge, reaching a maximum used_memory of 78.9G, while the instance's maxmemory configuration was only 16G. This ultimately led to a significant amount of data being evicted from the instance.
Below is part of the output from the INFO MEMORY command at the time the problem occurred:
# Memory
used_memory:84716542624
used_memory_human:78.90G
used_memory_rss:104497676288
used_memory_rss_human:97.32G
used_memory_peak:84716542624
used_memory_peak_human:78.90G
used_memory_peak_perc:100.00%
used_memory_overhead:75682545624
used_memory_startup:906952
used_memory_dataset:9033997000
used_memory_dataset_perc:10.66%
allocator_allocated:84715102264
allocator_active:101370822656
allocator_resident:102303637504
total_system_memory:810745470976
total_system_memory_human:755.07G
used_memory_lua:142336
used_memory_lua_human:139.00K
used_memory_scripts:6576
used_memory_scripts_human:6.42K
number_of_cached_scripts:13
maxmemory:17179869184
maxmemory_human:16.00G
maxmemory_policy:volatile-lru
allocator_frag_ratio:1.20
allocator_frag_bytes:16655720392
Memory surges leading to data eviction are a common issue in Redis. Many people often lack a clear analytical approach when facing such problems and mistakenly believe they are caused by operations like replication or RDB persistence. Next, let’s look at how to systematically analyze these types of issues.
This article mainly covers the following topics:

- How is used_memory derived in INFO?
- What is used_memory?
- What scenarios is used_memory typically used for?
- Changes in memory statistics in Redis 7.
- Trigger conditions for data eviction: does exceeding maxmemory always trigger eviction?
- Finally, I'll share a script to help analyze in real time which specific part of memory consumption is causing the growth in used_memory.
How is used_memory Derived from INFO?

When we execute the INFO command, Redis calls the genRedisInfoString function to generate its output.
// server.c
sds genRedisInfoString(const char *section) {
    ...
    /* Memory */
    if (allsections || defsections || !strcasecmp(section,"memory")) {
        ...
        size_t zmalloc_used = zmalloc_used_memory();
        ...
        if (sections++) info = sdscat(info,"\r\n");
        info = sdscatprintf(info,
            "# Memory\r\n"
            "used_memory:%zu\r\n"
            "used_memory_human:%s\r\n"
            "used_memory_rss:%zu\r\n"
            ...
            "lazyfreed_objects:%zu\r\n",
            zmalloc_used,
            hmem,
            server.cron_malloc_stats.process_rss,
            ...
            lazyfreeGetFreedObjectsCount()
        );
        freeMemoryOverheadData(mh);
    }
    ...
    return info;
}
As we can see, the value of used_memory comes from zmalloc_used, which is obtained through the zmalloc_used_memory() function.
// zmalloc.c
size_t zmalloc_used_memory(void) {
    size_t um;
    atomicGet(used_memory,um);
    return um;
}
The implementation of zmalloc_used_memory() is straightforward; it atomically reads the value of used_memory.
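If you just want to read this value from a client rather than from the source code, it is exposed directly by INFO MEMORY. Below is a minimal sketch using the redis-py package; the host and port are placeholders for your own instance.

import redis

r = redis.Redis(host="127.0.0.1", port=6379)

mem = r.info("memory")            # same data as INFO MEMORY, parsed into a dict
print(mem["used_memory"])         # raw byte count, i.e. the value read from zmalloc_used_memory()
print(mem["used_memory_human"])   # human-readable form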
What is used_memory?

used_memory is a static variable of type redisAtomic size_t, where redisAtomic is an alias for _Atomic. _Atomic is a keyword introduced in the C11 standard to declare atomic types, ensuring that operations on this type are atomic in a multithreaded environment, thus avoiding data races.
#define redisAtomic _Atomic
static redisAtomic size_t used_memory = 0;
The update of used_memory is mainly achieved through two macro definitions:
#define update_zmalloc_stat_alloc(__n) atomicIncr(used_memory,(__n))
#define update_zmalloc_stat_free(__n) atomicDecr(used_memory,(__n))
The update_zmalloc_stat_alloc(__n) macro is called when memory is allocated, incrementing used_memory by __n through an atomic operation. The update_zmalloc_stat_free(__n) macro is called when memory is freed, decrementing used_memory by __n through an atomic operation.

These two macros ensure that used_memory is updated accurately during memory allocation and deallocation, avoiding data races caused by concurrent operations.
Whenever memory is allocated or freed through the allocator (glibc's malloc, jemalloc, or tcmalloc; Redis uses jemalloc by default), update_zmalloc_stat_alloc or update_zmalloc_stat_free is called to update the value of used_memory.
In Redis, memory management is primarily implemented through the following two functions:
// zmalloc.c
void *ztrymalloc_usable(size_t size, size_t *usable) {
    ASSERT_NO_SIZE_OVERFLOW(size);
    void *ptr = malloc(MALLOC_MIN_SIZE(size)+PREFIX_SIZE);

    if (!ptr) return NULL;
#ifdef HAVE_MALLOC_SIZE
    size = zmalloc_size(ptr);
    update_zmalloc_stat_alloc(size);
    if (usable) *usable = size;
    return ptr;
#else
    ...
#endif
}

void zfree(void *ptr) {
    ...
    if (ptr == NULL) return;
#ifdef HAVE_MALLOC_SIZE
    update_zmalloc_stat_free(zmalloc_size(ptr));
    free(ptr);
#else
    ...
#endif
}
Where:

- The ztrymalloc_usable function allocates memory. It first calls malloc to allocate the memory; if the allocation succeeds, it updates used_memory through update_zmalloc_stat_alloc.
- The zfree function frees memory. Before releasing the memory, it decrements used_memory through update_zmalloc_stat_free, and then calls free to release it.
This mechanism ensures that Redis can accurately track memory allocation and deallocation, effectively managing memory usage.
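You can observe this accounting from the client side: writing and then deleting a key moves used_memory up and down by roughly the allocated size. Here is a quick sketch with redis-py, assuming a throwaway local instance (the key name is arbitrary and the deltas are approximate, since other allocations may happen concurrently):

import redis

r = redis.Redis(host="127.0.0.1", port=6379)

before = r.info("memory")["used_memory"]
r.set("mem:probe", "x" * 1_000_000)          # allocate roughly 1 MB for the value
after_set = r.info("memory")["used_memory"]
r.delete("mem:probe")
after_del = r.info("memory")["used_memory"]

print(after_set - before)     # grows by roughly the size of the value (plus key overhead)
print(after_del - after_set)  # shrinks again once the allocation is freed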
In What Scenarios is used_memory Typically Used?
used_memory consists of two main components:

- The data itself, corresponding to used_memory_dataset in the INFO command.
- The overhead of internal management and maintenance of data structures, corresponding to used_memory_overhead in the INFO command.
It is important to note that used_memory_dataset is not calculated from the number of keys and the memory they use; it is instead derived by subtracting used_memory_overhead from used_memory.
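You can see this relationship in the INFO output at the top of this article: 84716542624 - 75682545624 is exactly 9033997000, the reported used_memory_dataset. A small check with redis-py (connection details are placeholders):

import redis

mem = redis.Redis(host="127.0.0.1", port=6379).info("memory")

derived = mem["used_memory"] - mem["used_memory_overhead"]
print(derived, mem["used_memory_dataset"])   # the two values should match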
Next, let's focus on analyzing the source of used_memory_overhead. Redis provides a dedicated function, getMemoryOverheadData, specifically to calculate this part of the memory overhead.
// object.c
struct redisMemOverhead *getMemoryOverheadData(void) {
    int j;
    // mem_total accumulates the total memory overhead and will eventually be
    // assigned to `used_memory_overhead`.
    size_t mem_total = 0;
    // mem is used to calculate the memory usage of each component.
    size_t mem = 0;
    // Call `zmalloc_used_memory()` to get `used_memory`.
    size_t zmalloc_used = zmalloc_used_memory();
    // Allocate memory for a `redisMemOverhead` structure using `zcalloc`.
    struct redisMemOverhead *mh = zcalloc(sizeof(*mh));
    ...
    // Add the memory usage at Redis startup (`server.initial_memory_usage`) to the total overhead.
    mem_total += server.initial_memory_usage;

    mem = 0;
    // Add the memory overhead of the replication backlog buffer.
    if (server.repl_backlog)
        mem += zmalloc_size(server.repl_backlog);
    mh->repl_backlog = mem;
    mem_total += mem;

    /* Computing the memory used by the clients would be O(N) if done
     * here online. We use our values computed incrementally by
     * clientsCronTrackClientsMemUsage(). */
    // Calculate the memory overhead of clients.
    mh->clients_slaves = server.stat_clients_type_memory[CLIENT_TYPE_SLAVE];
    mh->clients_normal = server.stat_clients_type_memory[CLIENT_TYPE_MASTER]+
                         server.stat_clients_type_memory[CLIENT_TYPE_PUBSUB]+
                         server.stat_clients_type_memory[CLIENT_TYPE_NORMAL];
    mem_total += mh->clients_slaves;
    mem_total += mh->clients_normal;

    // Calculate the memory overhead of the AOF buffer and the AOF rewrite buffer.
    mem = 0;
    if (server.aof_state != AOF_OFF) {
        mem += sdsZmallocSize(server.aof_buf);
        mem += aofRewriteBufferSize();
    }
    mh->aof_buffer = mem;
    mem_total += mem;

    // Calculate the memory overhead of Lua script caches.
    mem = server.lua_scripts_mem;
    mem += dictSize(server.lua_scripts) * sizeof(dictEntry) +
           dictSlots(server.lua_scripts) * sizeof(dictEntry*);
    mem += dictSize(server.repl_scriptcache_dict) * sizeof(dictEntry) +
           dictSlots(server.repl_scriptcache_dict) * sizeof(dictEntry*);
    if (listLength(server.repl_scriptcache_fifo) > 0) {
        mem += listLength(server.repl_scriptcache_fifo) * (sizeof(listNode) +
            sdsZmallocSize(listNodeValue(listFirst(server.repl_scriptcache_fifo))));
    }
    mh->lua_caches = mem;
    mem_total += mem;

    // Calculate the memory overhead of the databases: iterate over all databases
    // (`server.dbnum`) and, for each one, account for the main dictionary
    // (`db->dict`) and the expiration dictionary (`db->expires`).
    for (j = 0; j < server.dbnum; j++) {
        redisDb *db = server.db + j;
        long long keyscount = dictSize(db->dict);
        if (keyscount == 0) continue;

        mh->total_keys += keyscount;
        mh->db = zrealloc(mh->db, sizeof(mh->db[0]) * (mh->num_dbs + 1));
        mh->db[mh->num_dbs].dbid = j;

        mem = dictSize(db->dict) * sizeof(dictEntry) +
              dictSlots(db->dict) * sizeof(dictEntry*) +
              dictSize(db->dict) * sizeof(robj);
        mh->db[mh->num_dbs].overhead_ht_main = mem;
        mem_total += mem;

        mem = dictSize(db->expires) * sizeof(dictEntry) +
              dictSlots(db->expires) * sizeof(dictEntry*);
        mh->db[mh->num_dbs].overhead_ht_expires = mem;
        mem_total += mem;

        mh->num_dbs++;
    }

    // Assign the calculated `mem_total` to `mh->overhead_total`.
    mh->overhead_total = mem_total;

    // Calculate the memory used for data (`zmalloc_used - mem_total`) and store it in `mh->dataset`.
    mh->dataset = zmalloc_used - mem_total;
    mh->peak_perc = (float)zmalloc_used * 100 / mh->peak_allocated;

    /* Metrics computed after subtracting the startup memory from
     * the total memory. */
    size_t net_usage = 1;
    if (zmalloc_used > mh->startup_allocated)
        net_usage = zmalloc_used - mh->startup_allocated;
    mh->dataset_perc = (float)mh->dataset * 100 / net_usage;
    mh->bytes_per_key = mh->total_keys ? (net_usage / mh->total_keys) : 0;

    return mh;
}
From the above code analysis, we can see that used_memory_overhead consists of the following parts:

- server.initial_memory_usage: the memory usage at Redis startup, corresponding to used_memory_startup in the INFO command.
- mh->repl_backlog: the memory overhead of the replication backlog buffer, corresponding to mem_replication_backlog in the INFO command.
- mh->clients_slaves: the memory overhead of replica (slave) clients, corresponding to mem_clients_slaves in the INFO command.
- mh->clients_normal: the memory overhead of other clients, corresponding to mem_clients_normal in the INFO command.
- mh->aof_buffer: the memory overhead of the AOF buffer and the AOF rewrite buffer, corresponding to mem_aof_buffer in the INFO command. The AOF buffer holds data before it is written to the AOF file, and the AOF rewrite buffer stores new writes that arrive during an AOF rewrite.
- mh->lua_caches: the memory overhead of Lua script caches, corresponding to used_memory_scripts in the INFO command (introduced in Redis 5.0).
- Dictionary memory overhead: this part is not shown in the INFO command but can be viewed via the MEMORY STATS command, as the snippet below shows.
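For reference, here is one way to pull that breakdown with redis-py. The fields overhead.total, dataset.bytes, and the per-database db.N entries come from the MEMORY STATS reply; this assumes a redis-py version that provides memory_stats().

import redis

r = redis.Redis(host="127.0.0.1", port=6379)
stats = r.memory_stats()                      # parsed MEMORY STATS reply

print(stats["overhead.total"])                # used_memory_overhead
print(stats["dataset.bytes"])                 # used_memory_dataset
for field, value in stats.items():
    if field.startswith("db."):               # per-database dictionary overhead
        print(field, value)                   # e.g. db.0 -> overhead.hashtable.main / .expires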
Among these memory overheads:

- used_memory_startup is generally stable.
- mem_replication_backlog is capped by repl-backlog-size.
- used_memory_scripts is generally small.
- The dictionary memory overhead grows proportionally with the data size.
Therefore, the main focus should be on three items: mem_clients_slaves, mem_clients_normal, and mem_aof_buffer.
- mem_aof_buffer: pay special attention to the buffer size during AOF rewriting.
- mem_clients_slaves and mem_clients_normal: these represent the memory usage of clients. Both are accounted for in the same way. The memory overhead of a client consists mainly of three parts (see the snippet after this list):
  - Input buffer: temporarily stores client commands; its size is limited by client-query-buffer-limit.
  - Output buffer: caches data to be sent to the client; its size is limited by client-output-buffer-limit. If the output exceeds the hard limit, or stays above the soft limit for too long, the client is closed.
  - The memory used by the client object itself.
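To see which connections are actually holding that memory, CLIENT LIST exposes the per-client buffer sizes. A sketch with redis-py (qbuf, omem, and tot-mem are standard CLIENT LIST fields; tot-mem requires Redis 6.2 or later):

import redis

r = redis.Redis(host="127.0.0.1", port=6379)

# Sort clients by total memory and show the top 10; values come back as strings.
clients = sorted(r.client_list(), key=lambda c: int(c.get("tot-mem", 0)), reverse=True)
for c in clients[:10]:
    print(c["addr"], c["flags"], c["cmd"],
          "qbuf:", c["qbuf"], "omem:", c["omem"], "tot-mem:", c.get("tot-mem"))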
Changes in Memory Statistics in Redis 7
In Redis 7, the following additional memory overhead items are tracked:
- mh->cluster_links: the memory overhead of cluster links, corresponding to mem_cluster_links in the INFO command.
- mh->functions_caches: the memory overhead of function caches, corresponding to used_memory_functions in the INFO command.
- The memory overhead of the key-to-slot mapping in cluster mode, corresponding to overhead.hashtable.slot-to-keys in the MEMORY STATS command.
Additionally, Redis 7 introduced the Multi-Part AOF feature, which removed the AOF rewrite buffer.
It is important to note that the way memory for mh->repl_backlog and mh->clients_slaves is calculated has also changed.
Before Redis 7, mh->repl_backlog accounted for the size of the replication backlog buffer, and mh->clients_slaves accounted for the memory usage of all replica clients.
if (server.repl_backlog)
    mem += zmalloc_size(server.repl_backlog);
mh->repl_backlog = mem;
mem_total += mem;

mem = 0;
// Iterate through all replica clients, accumulating the memory usage of their
// output buffers, input buffers, and the client object itself.
if (listLength(server.slaves)) {
    listIter li;
    listNode *ln;
    listRewind(server.slaves,&li);
    while((ln = listNext(&li))) {
        client *c = listNodeValue(ln);
        mem += getClientOutputBufferMemoryUsage(c);
        mem += sdsAllocSize(c->querybuf);
        mem += sizeof(client);
    }
}
mh->clients_slaves = mem;
Since each replica is allocated its own replication buffer (i.e., the output buffer of the replica's client), this approach wastes memory as the number of replicas increases. Moreover, if client-output-buffer-limit is set too high and there are too many replicas, it may cause the master to run out of memory (OOM).
To address this issue, Redis 7 introduced a global replication buffer. Both the replication backlog (repl-backlog) and the replica clients' replication buffers now share it.
The replBufBlock structure is used to store a block of the global replication buffer:
typedef struct replBufBlock {
    int refcount;           /* Number of replicas or repl backlog using. */
    long long id;           /* The unique incremental number. */
    long long repl_offset;  /* Start replication offset of the block. */
    size_t size, used;
    char buf[];
} replBufBlock;
Each replBufBlock contains a refcount field, which records how many replication consumers (including the master's replication backlog and the replicas) reference this block.

When a new replica is added, Redis does not allocate a new replication buffer block for it but instead increments the refcount of the existing replBufBlock.
Correspondingly, in Redis 7, the memory calculation for mh->repl_backlog and mh->clients_slaves has also changed:
if (listLength(server.slaves) &&
    (long long)server.repl_buffer_mem > server.repl_backlog_size) {
    mh->clients_slaves = server.repl_buffer_mem - server.repl_backlog_size;
    mh->repl_backlog = server.repl_backlog_size;
} else {
    mh->clients_slaves = 0;
    mh->repl_backlog = server.repl_buffer_mem;
}
if (server.repl_backlog) {
    /* The approximate memory of rax tree for indexed blocks. */
    mh->repl_backlog += server.repl_backlog->blocks_index->numnodes * sizeof(raxNode) +
                        raxSize(server.repl_backlog->blocks_index) * sizeof(void*);
}
mem_total += mh->repl_backlog;
mem_total += mh->clients_slaves;
Specifically, if the size of the global replication buffer exceeds repl-backlog-size, the replication backlog (mh->repl_backlog) is reported as repl-backlog-size and the remaining portion is attributed to replicas (mh->clients_slaves). If the global replication buffer is less than or equal to repl-backlog-size, its entire size is attributed to the replication backlog and mh->clients_slaves is reported as 0.
Additionally, since a Rax tree is introduced to index some nodes in the global replication buffer, the replication backlog also needs to account for the memory overhead of the Rax tree.
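To make the new split concrete, here is a small sketch (not Redis code) that mirrors the accounting above, ignoring the Rax index overhead:

def split_repl_buffer(repl_buffer_mem, repl_backlog_size, has_replicas):
    # Mirrors the Redis 7 accounting: the backlog is capped at repl-backlog-size,
    # and anything above it is attributed to replica clients.
    if has_replicas and repl_buffer_mem > repl_backlog_size:
        return repl_backlog_size, repl_buffer_mem - repl_backlog_size
    return repl_buffer_mem, 0

# Example: a 512 MB global replication buffer with repl-backlog-size = 160 MB.
backlog, slaves = split_repl_buffer(512 * 1024**2, 160 * 1024**2, has_replicas=True)
print(backlog, slaves)   # 167772160 (160 MB) counted as backlog, 369098752 (352 MB) as slave clients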
Conditions for Triggering Data Eviction
Many people mistakenly believe that data eviction occurs as soon as used_memory exceeds maxmemory. In reality, this is not the case.
The following conditions must be met for data to be evicted:
- maxmemory must be greater than 0.
- maxmemory-policy must not be noeviction.
- The memory usage must meet the threshold condition: it is not simply used_memory exceeding maxmemory, but used_memory minus mem_not_counted_for_evict exceeding maxmemory.
The value of mem_not_counted_for_evict can be obtained through the INFO command, and it is calculated in the freeMemoryGetNotCountedMemory function.
size_t freeMemoryGetNotCountedMemory(void) {
    size_t overhead = 0;
    int slaves = listLength(server.slaves);

    if (slaves) {
        listIter li;
        listNode *ln;
        listRewind(server.slaves,&li);
        while((ln = listNext(&li))) {
            client *slave = listNodeValue(ln);
            overhead += getClientOutputBufferMemoryUsage(slave);
        }
    }
    if (server.aof_state != AOF_OFF) {
        overhead += sdsalloc(server.aof_buf)+aofRewriteBufferSize();
    }
    return overhead;
}
The freeMemoryGetNotCountedMemory function calculates the total size of all replica clients' replication buffers (their output buffers), the AOF buffer, and the AOF rewrite buffer.

Therefore, when Redis determines whether data needs to be evicted, it subtracts the memory used by the replica clients' replication buffers, the AOF buffer, and the AOF rewrite buffer from used_memory.
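Putting the three conditions together, here is a quick check against a live instance using redis-py; all of the fields below come straight from INFO, and the connection details are placeholders:

import redis

r = redis.Redis(host="127.0.0.1", port=6379)
mem = r.info("memory")

counted_for_evict = mem["used_memory"] - mem["mem_not_counted_for_evict"]
eviction_possible = (
    mem["maxmemory"] > 0
    and mem["maxmemory_policy"] != "noeviction"
    and counted_for_evict > mem["maxmemory"]
)
print("counted_for_evict:", counted_for_evict, "eviction would trigger:", eviction_possible)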
Redis Memory Analysis Script
Lastly, here is a script that helps analyze Redis memory usage quickly. From its output, you can easily see the memory consumption of the different parts of Redis and identify which part's usage has increased when used_memory grows.
Script link: https://github.com/slowtech/dba-toolkit/blob/master/redis/redis_mem_usage_analyzer.py
# python3 redis_mem_usage_analyzer.py -host 10.0.1.182 -p 6379
Metric(2024-09-12 04:52:42)    Old Value       New Value(+3s)    Change per second
==========================================================================================
Summary
---------------------------------------------
used_memory                    16.43G          16.44G            1.1M
used_memory_dataset            11.93G          11.93G            22.66K
used_memory_overhead           4.51G           4.51G             1.08M

Overhead(Total)                4.51G           4.51G             1.08M
---------------------------------------------
mem_clients_normal             440.57K         440.52K           -18.67B
mem_clients_slaves             458.41M         461.63M           1.08M
mem_replication_backlog        160M            160M              0B
mem_aof_buffer                 0B              0B                0B
used_memory_startup            793.17K         793.17K           0B
used_memory_scripts            0B              0B                0B
mem_hashtable                  3.9G            3.9G              0B

Evict & Fragmentation
---------------------------------------------
maxmemory                      20G             20G               0B
mem_not_counted_for_evict      458.45M         461.73M           1.1M
mem_counted_for_evict          15.99G          15.99G            2.62K
maxmemory_policy               volatile-lru    volatile-lru
used_memory_peak               16.43G          16.44G            1.1M
used_memory_rss                16.77G          16.77G            1.32M
mem_fragmentation_bytes        345.07M         345.75M           232.88K

Others
---------------------------------------------
keys                           77860000        77860000          0.0
instantaneous_ops_per_sec      8339            8435
lazyfree_pending_objects       0               0                 0.0
The script collects Redis memory data at a fixed interval (set by the -i parameter, defaulting to 3 seconds). It then compares the newly collected data (New Value) with the previous data (Old Value) and calculates the change per second (Change per second).
The output is divided into four main parts:
- Summary: an overview, where used_memory = used_memory_dataset + used_memory_overhead.
- Overhead(Total): shows the memory consumption of each individual item within used_memory_overhead. Overhead(Total) equals the sum of all items and should theoretically match used_memory_overhead.
- Evict & Fragmentation: displays key metrics related to eviction and memory fragmentation. Here, mem_counted_for_evict = used_memory - mem_not_counted_for_evict, and data eviction only occurs when mem_counted_for_evict exceeds maxmemory.
- Others: other important metrics, including keys (the total number of keys), instantaneous_ops_per_sec (the number of operations per second), and lazyfree_pending_objects (the number of objects pending asynchronous deletion).
If you find that mem_clients_normal or mem_clients_slaves is large, you can use the --client option to check the memory usage of individual clients.
# python3 redis_mem_usage_analyzer.py -host 10.0.1.182 -p 6379 --client
ID     Address             Name    Age    Command    User       Qbuf    Omem       Total Memory
----------------------------------------------------------------------------------------------------
216    10.0.1.75:37811             721    psync      default    0B      232.83M    232.85M
217    10.0.1.22:35057             715    psync      default    0B      232.11M    232.13M
453    10.0.0.198:51172            0      client     default    26B     0B         60.03K
...
- Qbuf: size of the input buffer.
- Omem: size of the output buffer.
- Total Memory: total memory used by the connection.

Results are sorted by Total Memory in descending order.