Quick answer: Log the ratio of largest free block to total free memory over a long play session. A dropping ratio confirms fragmentation. Find the allocation hotspots with a heap profiler, and replace per-frame allocations with pooled objects or arena allocators that reset each frame.
A player reports your game crashes after three hours. You reproduce it, check Task Manager, and the process still has 500 MB of free memory, yet the next allocation fails. Total memory use is nowhere near the system limit. The allocator’s free list is a disaster: thousands of tiny gaps, no contiguous block big enough for a 4 MB texture. This is memory fragmentation, and it is the most annoying class of memory bug because your code looks fine.
What Fragmentation Is
When you allocate and free memory in random order over a long time, the allocator’s free list becomes a patchwork. Consider a simple sequence:
Allocate A (1 MB)    | A |               |
Allocate B (1 MB)    | A | B |           |
Allocate C (1 MB)    | A | B | C |       |
Free B               | A |   | C |       |    <- 1 MB gap in the middle
Allocate D (2 MB)    | A |   | C |   D   |    <- D cannot fit in the gap, goes to the end
After many cycles, you have lots of little gaps and very few big ones. A new 4 MB allocation fails not because there is no free memory but because there is no contiguous 4 MB free.
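To make the failure mode concrete, here is a minimal sketch of a toy first-fit heap that tracks blocks in whole megabytes. `ToyHeap`, `Block`, and `kNoSpace` are hypothetical names invented for this illustration, and real allocators are far more elaborate, but the arithmetic is the same.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returned by Allocate() when no single free block is big enough.
constexpr std::size_t kNoSpace = static_cast<std::size_t>(-1);

// One contiguous run of memory, measured in MB for readability.
struct Block {
    std::size_t size;
    bool used;
};

// Toy heap: an ordered list of blocks with first-fit allocation.
class ToyHeap {
public:
    explicit ToyHeap(std::size_t totalMb) { blocks_.push_back(Block{totalMb, false}); }

    // Returns the start offset (in MB) of the new block, or kNoSpace.
    std::size_t Allocate(std::size_t size) {
        std::size_t offset = 0;
        for (std::size_t i = 0; i < blocks_.size(); ++i) {
            if (!blocks_[i].used && blocks_[i].size >= size) {
                const std::size_t remainder = blocks_[i].size - size;
                blocks_[i] = Block{size, true};
                if (remainder > 0) {
                    // Keep the unused tail as its own free block.
                    blocks_.insert(blocks_.begin() + static_cast<std::ptrdiff_t>(i + 1),
                                   Block{remainder, false});
                }
                return offset;
            }
            offset += blocks_[i].size;
        }
        return kNoSpace;
    }

    // Frees the used block that starts at `offset`, then merges free neighbours.
    void Free(std::size_t offset) {
        std::size_t pos = 0;
        for (Block& b : blocks_) {
            if (pos == offset && b.used) { b.used = false; break; }
            pos += b.size;
        }
        Coalesce();
    }

    std::size_t TotalFree() const {
        std::size_t total = 0;
        for (const Block& b : blocks_) if (!b.used) total += b.size;
        return total;
    }

    std::size_t LargestFree() const {
        std::size_t largest = 0;
        for (const Block& b : blocks_) if (!b.used) largest = std::max(largest, b.size);
        return largest;
    }

private:
    // Adjacent free blocks become one; a used block in between prevents this.
    void Coalesce() {
        for (std::size_t i = 0; i + 1 < blocks_.size();) {
            if (!blocks_[i].used && !blocks_[i + 1].used) {
                blocks_[i].size += blocks_[i + 1].size;
                blocks_.erase(blocks_.begin() + static_cast<std::ptrdiff_t>(i + 1));
            } else {
                ++i;
            }
        }
    }

    std::vector<Block> blocks_;
};
```

With a 4 MB heap, allocating A, B, and C at 1 MB each and then freeing B leaves 2 MB free in total, but the largest free block is only 1 MB, so `Allocate(2)` returns `kNoSpace`.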
The Symptom
- The game runs fine for the first 30–60 minutes and then crashes with an out-of-memory error.
- Total process memory is stable or slowly rising, not spiking.
- Restarting the game fixes it temporarily.
- The crash stack shows a specific allocation (often a large texture or audio buffer) failing.
- The same build runs fine on machines with more RAM.
Step 1: Measure Fragmentation
Instrument your allocator to log two values every 30 seconds:
- total_free_bytes: the sum of all free blocks.
- largest_free_block: the biggest single contiguous free region.
The ratio largest_free_block / total_free_bytes is your fragmentation score. At the start of a fresh run, it is close to 1.0 (one big free block). Over time, it drops. As a rough rule of thumb, once it falls below about 0.1, large allocations start failing even though total free memory is still plentiful.
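// C++ with a custom allocator that exposes free list stats
void LogMemoryStats()
{
    auto stats = MyAllocator::GetStats();
    // Guard against division by zero when the heap is completely full.
    float ratio = stats.totalFree > 0
        ? float(stats.largestFreeBlock) / float(stats.totalFree)
        : 0.0f;
    // UE_LOG format strings must be wrapped in TEXT(). The casts make the
    // arguments match %llu whatever integer type the stats fields use.
    UE_LOG(LogMemory, Log, TEXT("total_free=%llu largest=%llu ratio=%.3f"),
        (unsigned long long)stats.totalFree,
        (unsigned long long)stats.largestFreeBlock, ratio);
}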
Plot the ratio over a 60-minute play session. If it drops, you have fragmentation. If it stays flat, your bug is elsewhere.
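If your allocator does not already expose these stats, they are cheap to derive from the sizes of its free blocks. The sketch below assumes you can walk the free list and collect the block sizes into a vector. `FragStats` and `ComputeFragStats` are hypothetical names, and reporting a ratio of 0.0 for a completely full heap is a convention chosen here, not a standard.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct FragStats {
    std::size_t totalFree = 0;         // sum of all free blocks, in bytes
    std::size_t largestFreeBlock = 0;  // biggest contiguous free region, in bytes
    double ratio = 0.0;                // largestFreeBlock / totalFree
};

// Computes the fragmentation score from a list of free block sizes.
FragStats ComputeFragStats(const std::vector<std::size_t>& freeBlockSizes) {
    FragStats stats;
    for (std::size_t size : freeBlockSizes) {
        stats.totalFree += size;
        stats.largestFreeBlock = std::max(stats.largestFreeBlock, size);
    }
    // With no free memory the ratio is undefined; report 0.0 (assumed convention).
    if (stats.totalFree > 0) {
        stats.ratio = static_cast<double>(stats.largestFreeBlock) /
                      static_cast<double>(stats.totalFree);
    }
    return stats;
}
```

Three free blocks of 1 MB, 1 MB, and 2 MB give a total of 4 MB, a largest block of 2 MB, and a ratio of 0.5.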
Step 2: Find the Hotspots
Fragmentation is caused by specific allocation patterns, not by the total allocation rate. Use a heap profiler to find call sites that allocate and free frequently.
- Tracy Profiler (C++): per-allocation tracking with low overhead; its macros compile to nothing when TRACY_ENABLE is not defined, so shipping builds pay no cost.
- Unity Memory Profiler: snapshot the heap over time and compare.
- Valgrind Massif (Linux): heap peak and allocation tree.
- Unreal Insights Memory: track per-tag allocations in real time.
Look for tags or call sites that produce many allocations of varied sizes — these are the ones shredding your free list.
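If none of these tools fits your engine, a crude in-house tracker can surface the same signal. This is a minimal sketch, assuming your allocation wrapper can pass a tag string at each call site. `AllocTracker`, `TagStats`, and `Suspects` are hypothetical names, and the thresholds are yours to tune.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Per-tag counters: how often a tag allocates and in how many distinct sizes.
struct TagStats {
    std::size_t allocCount = 0;
    std::set<std::size_t> distinctSizes;
};

class AllocTracker {
public:
    // Call this from your allocation wrapper (hypothetical hook).
    void OnAlloc(const std::string& tag, std::size_t bytes) {
        TagStats& stats = stats_[tag];
        ++stats.allocCount;
        stats.distinctSizes.insert(bytes);
    }

    // Tags with many allocations in many different sizes are the likely
    // fragmentation sources. Results come back in alphabetical order.
    std::vector<std::string> Suspects(std::size_t minAllocs,
                                      std::size_t minDistinctSizes) const {
        std::vector<std::string> result;
        for (const auto& entry : stats_) {
            if (entry.second.allocCount >= minAllocs &&
                entry.second.distinctSizes.size() >= minDistinctSizes) {
                result.push_back(entry.first);
            }
        }
        return result;
    }

private:
    std::map<std::string, TagStats> stats_;
};
```

A tag that allocates once at level load will not appear in `Suspects`, while a particle system that allocates hundreds of differently sized buffers per minute will.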
Step 3: Fix with Pooling
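The cheapest fix is to reuse objects instead of allocating and freeing them. A pool keeps released objects on a free list and hands them back out on demand, so steady-state gameplay stops touching the heap. The version below grows lazily until it reaches peak usage; pre-filling it at startup works too.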
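#include <vector>

template<typename T>
class ObjectPool
{
    std::vector<T*> _free;
public:
    ObjectPool() = default;
    // Non-copyable: a copy would delete the same objects twice.
    ObjectPool(const ObjectPool&) = delete;
    ObjectPool& operator=(const ObjectPool&) = delete;
    // Delete pooled objects at shutdown. Objects still checked out
    // remain the caller's responsibility.
    ~ObjectPool()
    {
        for (T* obj : _free) delete obj;
    }
    T* Acquire()
    {
        if (!_free.empty())
        {
            T* obj = _free.back();
            _free.pop_back();
            return obj;
        }
        // Pool is empty: grow by one. This only happens until peak usage.
        return new T();
    }
    void Release(T* obj)
    {
        obj->Reset(); // T must provide Reset() to clear per-use state
        _free.push_back(obj);
    }
};
// Use for frequently-spawned things
static ObjectPool<Bullet> bulletPool;
void FireWeapon()
{
    // position and velocity are assumed to be in scope here.
    Bullet* b = bulletPool.Acquire();
    b->Initialize(position, velocity);
    // When the bullet expires, call bulletPool.Release(b) instead of delete.
}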
Apply pooling to bullets, particles, enemies, UI notifications, audio sources — anything that has a high allocation rate. The pool grows until it matches your peak usage, then stays flat.
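If you already know the peak count, a fixed-capacity variant avoids the heap entirely after startup by keeping every object in one contiguous array. This is a sketch under assumptions rather than a drop-in replacement: `FixedPool` and `Particle` are hypothetical names, `T` must be default-constructible, and the caller is responsible for resetting object state and for never releasing the same pointer twice.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical pooled type, used only to illustrate the pool.
struct Particle {
    float x = 0.0f;
    float y = 0.0f;
    bool alive = false;
};

// Fixed-capacity pool: all objects live in one contiguous allocation made
// up front, so acquiring and releasing never touches the general heap.
template <typename T>
class FixedPool {
public:
    explicit FixedPool(std::size_t capacity) : storage_(capacity) {
        freeIndices_.reserve(capacity);
        // Push indices in reverse so slot 0 is handed out first.
        for (std::size_t i = capacity; i > 0; --i) freeIndices_.push_back(i - 1);
    }

    // Returns nullptr when the pool is exhausted instead of growing.
    T* Acquire() {
        if (freeIndices_.empty()) return nullptr;
        const std::size_t index = freeIndices_.back();
        freeIndices_.pop_back();
        return &storage_[index];
    }

    // `obj` must have come from this pool and must not be released twice.
    void Release(T* obj) {
        freeIndices_.push_back(static_cast<std::size_t>(obj - storage_.data()));
    }

    std::size_t Available() const { return freeIndices_.size(); }

private:
    std::vector<T> storage_;                // never resized after construction
    std::vector<std::size_t> freeIndices_;  // slots currently available
};
```

Returning `nullptr` on exhaustion forces a design decision at the call site (skip the spawn, recycle the oldest object, or log and raise the capacity), which is often preferable to silently growing.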
Step 4: Arena Allocators for Temporary Work
Per-frame temporary allocations (string formatting, small containers, intermediate math) should not go through the main heap. Use an arena (or linear) allocator: allocate a large fixed buffer at startup, bump a pointer forward for each allocation, and reset the pointer to the start at the end of the frame.
Arena resets are O(1) and leave zero fragmentation. Everything allocated in the arena is implicitly freed at reset time.
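#include <cstddef>
#include <cstdlib>

class FrameArena
{
    char* _base;
    size_t _offset;
    size_t _size;
public:
    // Allocate the backing buffer once at startup.
    explicit FrameArena(size_t size)
        : _base(static_cast<char*>(std::malloc(size))), _offset(0), _size(_base ? size : 0) {}
    ~FrameArena() { std::free(_base); }
    FrameArena(const FrameArena&) = delete;
    FrameArena& operator=(const FrameArena&) = delete;
    // align must be a power of two no larger than the alignment of _base
    // (malloc guarantees alignof(std::max_align_t), typically 16).
    void* Alloc(size_t bytes, size_t align = 16)
    {
        size_t aligned = (_offset + align - 1) & ~(align - 1);
        // Written this way so a huge request cannot overflow the check.
        if (aligned > _size || bytes > _size - aligned) return nullptr;
        void* p = _base + aligned;
        _offset = aligned + bytes;
        return p;
    }
    // Frees everything allocated this frame in O(1).
    void Reset() { _offset = 0; }
};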
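For a sense of how an arena fits into the frame loop, here is a self-contained sketch. `ScratchArena` and `RunFrames` are hypothetical names; the arena uses `std::align` from `<memory>` so alignment is computed on the real address rather than on the offset, and the buffer sizes are arbitrary.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>

// Bump allocator over one buffer allocated up front.
class ScratchArena {
public:
    explicit ScratchArena(std::size_t bytes)
        : buffer_(new unsigned char[bytes]), size_(bytes) {}

    // Returns aligned storage, or nullptr if the arena is full.
    // `align` must be a power of two.
    void* Alloc(std::size_t bytes, std::size_t align = alignof(std::max_align_t)) {
        void* cursor = buffer_.get() + offset_;
        std::size_t space = size_ - offset_;
        // std::align advances `cursor` to the next aligned address and
        // shrinks `space` by the padding, or returns nullptr if it cannot fit.
        if (std::align(align, bytes, cursor, space) == nullptr) return nullptr;
        offset_ = (size_ - space) + bytes;
        return cursor;
    }

    // Discards every allocation at once; no per-object free is needed.
    void Reset() { offset_ = 0; }

    std::size_t Used() const { return offset_; }

private:
    std::unique_ptr<unsigned char[]> buffer_;
    std::size_t size_;
    std::size_t offset_ = 0;
};

// Simulates a frame loop: grab temporaries, use them, reset at frame end.
// Returns the peak number of bytes in use during any single frame.
std::size_t RunFrames(ScratchArena& arena, int frames) {
    std::size_t peak = 0;
    for (int frame = 0; frame < frames; ++frame) {
        float* scratch = static_cast<float*>(
            arena.Alloc(256 * sizeof(float), alignof(float)));
        char* text = static_cast<char*>(arena.Alloc(64, 1));
        if (scratch != nullptr && text != nullptr) {
            scratch[0] = static_cast<float>(frame);
            text[0] = '\0';
        }
        peak = std::max(peak, arena.Used());
        arena.Reset();  // end of frame: everything above is gone
    }
    return peak;
}
```

Logging the peak during development is a simple way to size the buffer: if it approaches the arena capacity, either grow the buffer or find the system that is allocating more temporaries than expected.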
Step 5: Segregate Long-Lived from Short-Lived
The worst fragmentation happens when long-lived allocations are interleaved with short-lived ones. A long-lived object sits in the middle of the heap forever, blocking the free regions around it from coalescing.
Use separate heaps or regions for different lifetimes:
- Level data: allocated once when the level loads, freed when the level unloads. Lives in its own arena.
- Persistent objects: player, UI, main game systems. Allocated once at boot, freed at shutdown. Lives in the main heap.
- Per-frame work: goes in the frame arena.
- Frequently-pooled objects: bullets, particles, etc. Live in pools.
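Each region is homogeneous in lifetime, so its blocks are freed together and the free space coalesces back into large contiguous runs instead of being pinned apart by long-lived neighbours.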
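One way to enforce the split is to make every allocation state its lifetime up front. The sketch below is a simplification under assumptions: `Lifetime`, `Region`, and `LifetimeHeaps` are hypothetical names, the persistent category gets its own region here rather than the main heap, and sizes are rounded up to 16 bytes on the assumption that the backing storage itself is 16-byte aligned.

```cpp
#include <array>
#include <cstddef>
#include <vector>

enum class Lifetime { Persistent = 0, Level = 1, Frame = 2 };

// A bump region that can only be released as a whole.
class Region {
public:
    explicit Region(std::size_t bytes) : storage_(bytes) {}

    void* Alloc(std::size_t bytes) {
        if (bytes > storage_.size()) return nullptr;
        // Round up so consecutive allocations stay on 16-byte offsets.
        const std::size_t rounded = (bytes + 15) & ~static_cast<std::size_t>(15);
        if (rounded > storage_.size() - used_) return nullptr;
        void* p = storage_.data() + used_;
        used_ += rounded;
        return p;
    }

    void ReleaseAll() { used_ = 0; }
    std::size_t Used() const { return used_; }

private:
    std::vector<unsigned char> storage_;
    std::size_t used_ = 0;
};

// Routes each allocation to the region that matches its lifetime.
class LifetimeHeaps {
public:
    LifetimeHeaps(std::size_t persistentBytes, std::size_t levelBytes,
                  std::size_t frameBytes)
        : regions_{{Region(persistentBytes), Region(levelBytes), Region(frameBytes)}} {}

    void* Alloc(Lifetime lifetime, std::size_t bytes) {
        return regions_[Index(lifetime)].Alloc(bytes);
    }

    void EndFrame() { regions_[Index(Lifetime::Frame)].ReleaseAll(); }
    void UnloadLevel() { regions_[Index(Lifetime::Level)].ReleaseAll(); }

    std::size_t Used(Lifetime lifetime) const {
        return regions_[Index(lifetime)].Used();
    }

private:
    static std::size_t Index(Lifetime lifetime) {
        return static_cast<std::size_t>(lifetime);
    }

    std::array<Region, 3> regions_;
};
```

Because the lifetime is part of the call, a review can catch a per-frame string being allocated as `Lifetime::Persistent` before it ever fragments anything.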
Verifying the Fix
Run a two-hour play session with the fragmentation logger enabled. Compare the ratio graph before and after. The post-fix graph should stay flat near 1.0 instead of dropping toward 0.
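The comparison can also be automated so the soak test produces a pass or fail result instead of a graph someone has to look at. A minimal sketch, assuming the logged ratios have been collected into a vector: `SoakResult` and `EvaluateSoak` are hypothetical names, and the floor value is a threshold you choose for your own game.

```cpp
#include <algorithm>
#include <vector>

struct SoakResult {
    double minRatio = 0.0;    // worst fragmentation score seen in the session
    double finalRatio = 0.0;  // score at the end of the session
    bool passed = false;      // true if the score never dipped below the floor
};

// Evaluates the logged largest_free_block / total_free_bytes samples.
// An empty sample list is treated as a failure: no data, no pass.
SoakResult EvaluateSoak(const std::vector<double>& ratios, double floor) {
    SoakResult result;
    if (ratios.empty()) return result;
    result.minRatio = *std::min_element(ratios.begin(), ratios.end());
    result.finalRatio = ratios.back();
    result.passed = result.minRatio >= floor;
    return result;
}
```

Checking the minimum rather than only the final value catches sessions where the heap fragments badly mid-run and then recovers after a level unload.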
“Memory fragmentation is a patience problem. It does not show up in your half-hour playtest, and it does show up in the player’s all-night marathon. Instrument early, fix early, sleep well.”
Related Issues
For general memory leak debugging, see memory leak detection in game development. For scene transition spikes, see how to debug memory spikes during scene transitions. For catching the leaks in QA, see how to catch memory leaks during QA testing.
A two-hour soak test every build catches fragmentation before it reaches players. It is the one “slow test” that is genuinely worth running unattended.