Quick answer: Reduce draw calls by batching objects that share materials, using GPU instancing for repeated geometry, consolidating textures into atlases, and sorting render order to minimize state changes. Profile first to identify where draw call overhead is actually hurting your frame rate.
You’re hitting 30 FPS in a scene that doesn’t look particularly demanding. The GPU profiler shows plenty of headroom, but the CPU is maxed out. The culprit is almost certainly draw call overhead—the CPU is spending more time telling the GPU what to render than the GPU spends actually rendering it. Draw call optimization is one of the highest-impact performance improvements you can make, especially on mobile platforms and older hardware.
Understanding Draw Call Overhead
Every time your game tells the GPU to render something, it issues a draw call. Each draw call involves binding vertex and index buffers, setting shader uniforms, switching textures and materials, and validating the pipeline state. On older graphics APIs like OpenGL and DirectX 11, this overhead is substantial because the driver must validate the entire state on every call.
The cost isn’t in the GPU doing the work; modern GPUs can process millions of triangles per frame without breaking a sweat. The cost is in the CPU-side preparation and driver overhead. A scene with 5,000 draw calls might spend 8-10ms just in draw call submission, leaving only about 7-9ms of the 16.7ms frame budget for everything else at a 60 FPS target. The same geometry rendered in 500 draw calls through batching might take only 1ms to submit.
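The budget arithmetic above can be sketched in a few lines. The per-call cost of 2 microseconds is an assumption chosen only so that 5,000 calls land at the upper end of the 8-10ms range quoted in the text; real costs vary widely by API, driver, and device, so measure rather than rely on it.

```python
# Rough frame-budget estimate for draw call submission.
# PER_CALL_US is an assumed, illustrative CPU cost per draw call.
PER_CALL_US = 2.0


def frame_budget_ms(target_fps: float) -> float:
    """Total time available per frame, in milliseconds."""
    return 1000.0 / target_fps


def submission_ms(draw_calls: int, per_call_us: float = PER_CALL_US) -> float:
    """Estimated CPU time spent submitting draw calls, in milliseconds."""
    return draw_calls * per_call_us / 1000.0


def remaining_ms(draw_calls: int, target_fps: float = 60.0) -> float:
    """Budget left for everything else after draw call submission."""
    return frame_budget_ms(target_fps) - submission_ms(draw_calls)


if __name__ == "__main__":
    for calls in (5000, 500):
        print(f"{calls} calls: {submission_ms(calls):.1f} ms to submit, "
              f"{remaining_ms(calls):.1f} ms left at 60 FPS")
```

Swapping in a per-call cost measured on your target device turns this into a quick way to decide how many draw calls a frame can afford.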
To identify your current draw call count, use your engine’s built-in statistics. In Unity, enable the Stats overlay in the Game view or use the Frame Debugger. In Unreal, use stat scenerendering in the console. In Godot, enable the Performance monitor from the Debugger panel. Aim to understand which objects, materials, and systems contribute the most draw calls before optimizing.
Batching Techniques
Static batching combines meshes that share a material into a single large mesh at build time. This collapses many draw calls into one: instead of 100 separate draw calls for 100 crates, the engine submits one draw call for a single combined mesh. The trade-off is increased memory usage because the combined mesh stores duplicated vertex data, and the objects cannot move at runtime.
In Unity, enable Static Batching in Player Settings and mark objects as “Batching Static” in the Inspector. In Unreal, use the Merge Actors tool to combine static meshes. In Godot, the most dependable route is to combine meshes in your 3D modeling tool before import, because built-in mesh merging support depends on the engine version you are using.
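To make the idea of a combined mesh concrete, here is a minimal sketch of what merging does to vertex and index data. It is not any engine's actual implementation: the `Mesh` type and `merge_meshes` function are hypothetical, and the vertices are assumed to be in world space already.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Mesh:
    vertices: List[Vec3]  # positions, assumed already transformed to world space
    indices: List[int]    # triangle list, three indices per triangle


def merge_meshes(meshes: List[Mesh]) -> Mesh:
    """Concatenate meshes that share a material into one mesh."""
    vertices: List[Vec3] = []
    indices: List[int] = []
    for mesh in meshes:
        base = len(vertices)  # offset so indices still point at this mesh's vertices
        vertices.extend(mesh.vertices)
        indices.extend(base + i for i in mesh.indices)
    return Mesh(vertices, indices)


if __name__ == "__main__":
    tri_a = Mesh([(0, 0, 0), (1, 0, 0), (0, 1, 0)], [0, 1, 2])
    tri_b = Mesh([(5, 0, 0), (6, 0, 0), (5, 1, 0)], [0, 1, 2])
    merged = merge_meshes([tri_a, tri_b])
    print(len(merged.vertices), merged.indices)
```

Every source mesh contributes its own copy of the vertex data, which is where the extra memory cost of static batching comes from.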
Dynamic batching combines small moving meshes at runtime. It’s less effective than static batching because the CPU must re-combine meshes every frame, but it helps for small objects like bullets, particles, and UI elements. Most engines limit dynamic batching to small meshes (in Unity, at most 900 vertex attributes per mesh, which works out to 300 vertices for a shader that reads position, normal, and one UV set) because the CPU cost of combining larger meshes exceeds the draw call savings.
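The size limit can be expressed as a simple eligibility check. The sketch below uses the limits documented for Unity's built-in dynamic batching (900 vertex attributes and 300 vertices per mesh); treat both numbers as assumptions to confirm for your engine version. The function names are hypothetical.

```python
# Limits as documented for Unity's built-in dynamic batching.
# Treat them as assumptions and confirm them for your engine version.
MAX_VERTEX_ATTRIBUTES = 900
MAX_VERTICES = 300


def can_dynamic_batch(vertex_count: int, attributes_per_vertex: int) -> bool:
    """True if a mesh is small enough to be considered for dynamic batching."""
    if vertex_count > MAX_VERTICES:
        return False
    return vertex_count * attributes_per_vertex <= MAX_VERTEX_ATTRIBUTES


def max_batchable_vertices(attributes_per_vertex: int) -> int:
    """Largest vertex count that still fits under both limits."""
    return min(MAX_VERTICES, MAX_VERTEX_ATTRIBUTES // attributes_per_vertex)


if __name__ == "__main__":
    # Position + normal + one UV set = 3 attributes per vertex.
    print(max_batchable_vertices(3))
    # Position + normal + two UV sets + tangent = 5 attributes per vertex.
    print(max_batchable_vertices(5))
```

A richer vertex format shrinks the usable mesh size, so simple shaders leave more room for dynamic batching than heavily attributed ones.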
SRP Batcher (Unity-specific) takes a different approach: instead of combining meshes, it batches the bind and draw commands for objects using the same shader variant, even if they have different material properties. It does not lower the draw call count itself; it makes each call cheaper by cutting the state setup between calls, which dramatically reduces overhead for scenes with many materials that share the same shader. Enable it in your URP or HDRP asset settings.
GPU Instancing
GPU instancing is the most efficient technique for rendering many copies of the same mesh. Instead of submitting a separate draw call for each tree in a forest, instancing submits a single draw call with a buffer of per-instance data (position, rotation, scale, color). The GPU reads the instance buffer and renders all copies in one pass.
In Unity, enable “Enable GPU Instancing” on your material. For custom shaders, use the instancing macros:
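// In your shader
// Declare per-instance properties inside an instancing buffer
UNITY_INSTANCING_BUFFER_START(Props)
    UNITY_DEFINE_INSTANCED_PROP(float4, _Color)
UNITY_INSTANCING_BUFFER_END(Props)

void surf(Input IN, inout SurfaceOutput o) {
    // Read this instance's color instead of a shared material value
    o.Albedo = UNITY_ACCESS_INSTANCED_PROP(Props, _Color).rgb;
}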
In Unreal, instancing is available through Instanced Static Mesh (ISM) and Hierarchical Instanced Static Mesh (HISM) components. HISM adds LOD and culling of instance clusters on top of instancing, making it ideal for vegetation and large prop populations. Foliage placed with the Foliage tool uses HISM by default.
In Godot, use MultiMeshInstance3D for instanced rendering. Create a MultiMesh resource, set the instance count, and assign per-instance transforms:
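# Godot - MultiMesh setup for instanced rendering
# Assumes this script is attached to a node with a MultiMeshInstance3D child
extends Node3D

func _ready():
    var multimesh = MultiMesh.new()
    multimesh.mesh = preload("res://meshes/tree.tres")
    # Set the transform format before instance_count
    multimesh.transform_format = MultiMesh.TRANSFORM_3D
    multimesh.instance_count = 1000
    for i in range(multimesh.instance_count):
        # Named xform to avoid shadowing Node3D's built-in transform property
        var xform = Transform3D()
        xform.origin = Vector3(randf() * 100, 0, randf() * 100)
        multimesh.set_instance_transform(i, xform)
    $MultiMeshInstance3D.multimesh = multimesh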
Instancing works best when you have many copies of identical geometry. If every object in your scene is unique, instancing won’t help—focus on batching and material consolidation instead.
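A quick way to judge whether instancing will pay off is to group the scene by mesh and material and count the groups. The sketch below is engine-agnostic and assumes a hypothetical scene description of `(mesh, material, position)` tuples; each resulting group stands for one instanced draw call with its own per-instance buffer.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Position = Tuple[float, float, float]
SceneObject = Tuple[str, str, Position]  # (mesh name, material name, position)


def build_instance_groups(
    objects: Iterable[SceneObject],
) -> Dict[Tuple[str, str], List[Position]]:
    """Group objects by (mesh, material); each group is one instanced draw."""
    groups: Dict[Tuple[str, str], List[Position]] = defaultdict(list)
    for mesh, material, position in objects:
        groups[(mesh, material)].append(position)
    return dict(groups)


if __name__ == "__main__":
    scene: List[SceneObject] = []
    scene += [("tree", "bark", (float(i), 0.0, 0.0)) for i in range(1000)]
    scene += [("rock", "stone", (0.0, 0.0, float(i))) for i in range(200)]
    scene += [("statue", "marble", (50.0, 0.0, 50.0))]

    groups = build_instance_groups(scene)
    print(f"{len(scene)} objects -> {len(groups)} instanced draw calls")
```

If the number of groups is close to the number of objects, the scene is mostly unique geometry and instancing has little to offer.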
Texture Atlasing and Material Consolidation
Draw calls can only be batched when objects share the same material. If every prop in your environment uses a unique texture, that’s a unique material and a unique draw call per prop. Texture atlasing solves this by packing multiple textures into a single large texture and adjusting UV coordinates to reference the correct region.
For 2D games, sprite atlases are standard practice. Unity’s Sprite Atlas, Godot’s AtlasTexture, and texture packing tools like TexturePacker combine individual sprites into atlas sheets. This can reduce a 2D scene from hundreds of draw calls to a handful.
For 3D games, create material atlases by combining diffuse, normal, and roughness textures for similar objects into shared atlas textures. Props that appear in the same environment—crates, barrels, fences, rocks—can share a single atlas material. The texture artist packs all their textures into one 2048x2048 or 4096x4096 atlas, and each mesh’s UVs reference its region of the atlas.
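The UV adjustment is a scale and an offset. As a minimal sketch, assuming a square atlas, regions given in pixels, and UVs that stay within the 0-1 range (the `AtlasRegion` type and `remap_uv` function are hypothetical names):

```python
from typing import NamedTuple, Tuple


class AtlasRegion(NamedTuple):
    x: int       # left edge of the region in the atlas, in pixels
    y: int       # top edge of the region in the atlas, in pixels
    width: int   # region width in pixels
    height: int  # region height in pixels


def remap_uv(uv: Tuple[float, float], region: AtlasRegion,
             atlas_size: int) -> Tuple[float, float]:
    """Map a 0-1 UV on the original texture to its location in the atlas."""
    u, v = uv
    return ((region.x + u * region.width) / atlas_size,
            (region.y + v * region.height) / atlas_size)


if __name__ == "__main__":
    # A 512x512 crate texture packed at pixel (1024, 0) in a 2048x2048 atlas.
    crate = AtlasRegion(x=1024, y=0, width=512, height=512)
    for uv in ((0.0, 0.0), (0.5, 0.5), (1.0, 1.0)):
        print(uv, "->", remap_uv(uv, crate, atlas_size=2048))
```

Meshes that rely on tiling (UVs outside 0-1) cannot be remapped this way, and regions usually need a few pixels of padding so that filtering and mipmaps do not bleed neighboring textures into each other.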
Beyond atlasing, minimize the number of distinct shaders in your project. Each unique shader-material combination creates a new rendering state that prevents batching. Use material instances (Unreal) or material property blocks (Unity) to vary appearance without creating new materials. In Unity, note that material property blocks pair well with GPU instancing but make a renderer incompatible with the SRP Batcher, so choose one path per object. Ten objects using the same shader with different tint colors can still be drawn efficiently together; ten objects using ten different shaders cannot.
Sorting and State Change Reduction
Even when you can’t batch draw calls, you can reduce their cost by sorting objects to minimize state changes between calls. Switching shaders is expensive. Switching textures is moderate. Switching uniform values is cheap. Sort your draw calls so that objects using the same shader are rendered consecutively, then within each shader group, sort by texture, then by other state.
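The sort order described above amounts to a composite sort key. The sketch below counts shader and texture switches before and after sorting; the `DrawCall` type and the shader and texture names are hypothetical, and the model ignores every other kind of state.

```python
from typing import Dict, List, NamedTuple


class DrawCall(NamedTuple):
    shader: str
    texture: str


def count_state_changes(calls: List[DrawCall]) -> Dict[str, int]:
    """Count how often the bound shader and texture change across the list."""
    changes = {"shader": 0, "texture": 0}
    prev = None
    for call in calls:
        if prev is None or call.shader != prev.shader:
            changes["shader"] += 1
        if prev is None or call.texture != prev.texture:
            changes["texture"] += 1
        prev = call
    return changes


def sort_for_submission(calls: List[DrawCall]) -> List[DrawCall]:
    """Most expensive state first in the key: shader, then texture."""
    return sorted(calls, key=lambda c: (c.shader, c.texture))


if __name__ == "__main__":
    calls = [
        DrawCall("lit", "wood"), DrawCall("unlit", "glow"),
        DrawCall("lit", "stone"), DrawCall("unlit", "glow"),
        DrawCall("lit", "wood"), DrawCall("lit", "stone"),
    ]
    print("unsorted:", count_state_changes(calls))
    print("sorted:  ", count_state_changes(sort_for_submission(calls)))
```

This ordering applies to opaque geometry. Transparent objects generally have to be drawn back to front for correct blending, which limits how freely they can be reordered.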
Most engines handle render sorting automatically, but you can help by structuring your materials to minimize permutations. Avoid unnecessary shader keywords or variants. In Unity, check how many shader variants your project generates with the “Shader Variant Collection” tool: a shader with 10 independent on/off keywords can generate up to 2^10 = 1,024 variants, and every variant actually in use is a distinct shader as far as state setup is concerned.
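The variant count grows multiplicatively: each group of mutually exclusive keywords contributes one choice, and the total is the product of the group sizes. A small sketch, with hypothetical keyword names, assuming no variants are stripped at build time:

```python
from math import prod
from typing import List, Sequence


def variant_count(keyword_groups: Sequence[Sequence[str]]) -> int:
    """Upper bound on variants: one choice per group, all combinations."""
    return prod(len(options) for options in keyword_groups)


if __name__ == "__main__":
    # Ten independent on/off toggles ("_" stands for the keyword being off).
    toggles: List[List[str]] = [["_", f"FEATURE_{i}"] for i in range(10)]
    print(variant_count(toggles))

    # A mix of toggles and one three-way choice.
    mixed = [
        ["_", "FOG"],
        ["SHADOWS_OFF", "SHADOWS_HARD", "SHADOWS_SOFT"],
        ["_", "NORMAL_MAP"],
    ]
    print(variant_count(mixed))
```

Removing a single on/off keyword halves the upper bound, which is why pruning unused keywords is usually worth the effort.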
For UI rendering, draw call overhead is often the primary bottleneck. Every UI element with a different texture or font generates a separate draw call. Use sprite atlases for UI icons, minimize the number of distinct fonts, and avoid interleaving text and image elements when possible. A UI canvas with 500 draw calls is common in complex menus and is almost always avoidable with proper atlasing and layout optimization.
Finally, remember that draw call optimization has diminishing returns. If your profiler shows draw call submission taking 2ms of a 16ms frame budget, a 50% reduction saves only 1ms. Spend your optimization time where it makes the biggest difference—and always profile before and after to confirm your changes actually helped.
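The diminishing-returns argument is easy to quantify before committing to the work. The sketch below reuses the 2ms and 16ms figures from the paragraph above; the second scenario is an invented comparison for illustration, and the function names are hypothetical.

```python
def saved_ms(submission_ms: float, reduction: float) -> float:
    """Frame time saved by cutting draw call submission by `reduction` (0-1)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return submission_ms * reduction


def fps(frame_ms: float) -> float:
    """Frames per second for a given frame time in milliseconds."""
    return 1000.0 / frame_ms


if __name__ == "__main__":
    # (frame time, time spent in draw call submission), both in milliseconds.
    scenarios = [(16.0, 2.0), (33.3, 10.0)]  # second pair is illustrative only
    for frame_ms, submit_ms in scenarios:
        gain = saved_ms(submit_ms, 0.5)
        print(f"{frame_ms} ms frame, {submit_ms} ms submission: "
              f"saves {gain:.1f} ms, "
              f"{fps(frame_ms):.1f} -> {fps(frame_ms - gain):.1f} FPS")
```

The same 50% reduction is worth far more when submission dominates the frame, which is the case for profiling first.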
Fewer draw calls mean less CPU time talking to the GPU and more time for actual gameplay logic.