Why is pygame.mask so slow compared to rect collision?

Mask overlap walks every pixel of the overlapping region. A 64x64 sprite has 4096 pixel checks per pair. Rect collision is one integer comparison. Using mask on every pair is O(N*M*pixels); it blows up fast.

Should I always precompute pygame masks?

Yes. pygame.mask.from_surface is expensive - call it once per unique image, not per instance or frame. Cache masks in a dict keyed on the image reference and reuse them across all sprites using that image.

How do I combine rect and mask collision?

Use pygame.sprite.spritecollide with collide_rect first to get candidates, then run mask_overlap on just those. This broad-phase / narrow-phase pattern skips 95% of mask work. Pygame's spritecollide even takes a collided= callback where you can chain checks.

Fix: Pygame Mask Pixel-Perfect Collision Slow at Scale

Quick answer: Cache masks per-image (not per-sprite, never per-frame), broad-phase with rect collision first, and only run collide_mask on pairs that pass the rect test. 100 bullets vs 50 enemies goes from 15 FPS to 60.

You want pixel-perfect bullet-to-enemy collision in your bullet hell. You use pygame.sprite.groupcollide with collide_mask. It works for 20 bullets. At 200, FPS drops to single digits. Mask collision is pixel-accurate but expensive; without a broad phase, it’s O(N×M×pixels) and scales terribly.

Mask Cost

Each mask overlap walks every pixel in the overlap region. Two 64×64 sprites overlapping fully = 4096 pixel tests per pair. 200 bullets vs 50 enemies = 10,000 pairs. Before rect-filtering, that’s 40 million pixel tests per frame. You cannot hit 60 FPS at that rate.

Cache Masks

_mask_cache = {}

def get_mask(image):
    if image not in _mask_cache:
        _mask_cache[image] = pygame.mask.from_surface(image)
    return _mask_cache[image]

pygame.mask.from_surface is not free. Call it once per unique image asset and cache. All sprites sharing an image share a mask reference.

Broad-Phase First

def collide_rect_then_mask(a, b):
    if not a.rect.colliderect(b.rect):
        return False
    offset = (b.rect.x - a.rect.x, b.rect.y - a.rect.y)
    return a.mask.overlap(b.mask, offset) is not None

# Use as collided= callback
hits = pygame.sprite.groupcollide(bullets, enemies, True, False,
                                collided=collide_rect_then_mask)

The rect test rules out 90%+ of pairs instantly. Only pairs with overlapping rects pay the mask cost.

When You Don't Need Mask

If your sprites are rectangular (crates, tiles, most platformer characters), skip mask entirely. For circular sprites, collide_circle is faster. Use mask only when sprite shapes are genuinely irregular (bullets with trails, custom hitboxes, destructible geometry).

Spatial Partitioning

For 1000+ sprites, even rect-vs-rect becomes expensive. Add a uniform grid or quadtree. Pygame has no built-in, but a simple dict of (grid_x, grid_y) -> [sprites] turns collision from O(N²) to near O(N).

Verifying

Use pygame.time.Clock to measure per-frame collision time. With the optimization, total mask cost should be under 1 ms regardless of sprite count. Anything higher and the broad phase is broken.

Understanding the issue

This bug class falls into a pattern that's worth understanding beyond the specific case. In Pygame, the underlying behavior is shaped by how the engine layers its abstractions - the public API you call, the runtime systems that respond, and the platform-specific implementations underneath. A bug at any layer can produce symptoms that look like they originate at a different layer. Triaging effectively means recognizing which layer the symptom belongs to, even when the gameplay code is what's visible.

The specific bug described above is the kind that surfaces during integration rather than unit testing. It depends on a combination of factors: the asset configuration, the runtime state, the platform's specific behavior. In isolation, each piece looks correct; in combination, the bug emerges. This is why thorough integration testing - playing the actual game in realistic conditions - catches things that automated tests miss.

Why this happens

Bugs of this class are particularly easy to ship past internal QA because they often depend on specific runtime conditions - hardware combinations, network states, or asset configurations that QA didn't reproduce. Players hit them in the wild, file reports that are hard to repro, and the bug accumulates negative reviews while engineering tries to recreate the failure mode.

At the engine level, the behavior comes from a deliberate design decision in Pygame. The engine team chose a particular trade-off - usually performance versus convenience, or generality versus specificity - and that trade-off has consequences when you push against it. Understanding the trade-off is what turns 'this bug is mysterious' into 'this bug is the expected consequence of this design'.

Verifying the fix

After applying the fix, the verification step has three parts: confirm the original repro is resolved, confirm no obvious regressions in adjacent functionality, and (for shipping titles) deploy to a small player cohort first and watch the crash and report rates. Each step catches something the others miss.

Reproducibility is the prerequisite for verification. If you can't reliably reproduce the bug pre-fix, you can't reliably verify it post-fix. Spend time getting a clean reproduction before you write any fix code. The fix is fast once you understand the reproduction; the reproduction is the slow part.

Variations to watch for

Related bug classes often share the same root cause. If you find yourself fixing this issue, look for cousins: similar symptoms in adjacent systems, the same data flow but a different value, or the same fix pattern in another module. The catalog of 'we've seen this before' becomes valuable institutional knowledge.

Adjacent bugs often share a root cause. After fixing the case you've found, spend an hour searching the codebase for similar patterns. What's the same call with different arguments? The same data flow with a different entity type? The same lifecycle issue in a sibling system? Each match is a candidate for the same fix, or a related fix that prevents future bugs of the same class.

In production

For shipping titles with a long support window, watch for this issue resurfacing after dependency updates. Engine upgrades, driver updates, OS releases - each one can resurface a bug class you thought you'd fixed because the underlying behavior changed slightly. Regression tests catch the obvious ones; player reports catch the rest.

When triaging a similar issue in production, prioritize gathering data over hypothesizing causes. A player report describes a symptom; what you need is a build SHA, a session timestamp, and ideally a screen recording or session replay. With those, the bug becomes tractable. Without them, you're guessing at hypothetical reproductions that may not match what the player actually hit.

Performance considerations

If this issue manifests under high load (many actors, many particles, many network connections), profile the post-fix code path with realistic counts. The original cost was a bug; the new cost is real work, and real work has a budget.

Diagnostic approach

The diagnostic tools available depend on your engine and platform. Use the engine's native profilers and debug overlays before reaching for external tools. The native tools have context that external tools lack - they know which subsystem owns the code, which assets are loaded, and what state the engine is in.

For Pygame-specific diagnostics, the editor's profiler is the canonical starting point. Capture a representative frame with the symptom present; compare against a frame without the symptom; the diff often points directly at the cause. If the symptom is non-deterministic, capture multiple frames and look for the pattern - the cause is usually a state transition or a specific input value rather than a continuous effect.

Tooling and ecosystem

Modern engine versions ship better tooling for this kind of issue than older versions. If you're on an older release, the diagnostic step may take significantly longer because the tools you'd want don't exist yet. Sometimes the right answer is upgrading rather than fighting through limited tooling.

Within Pygame, the relevant diagnostic surfaces include the standard frame debugger, memory profiler, and engine-specific debug overlays. Each one shows a different facet of what's happening. The frame debugger reveals draw call ordering and state transitions; the memory profiler shows allocation patterns; the debug overlay reveals per-system state. Bugs that resist one tool usually surrender to another - the trick is knowing which tool to reach for first.

Edge cases and pitfalls

Edge cases for this class of issue often involve specific timing: the first frame after a state change, the last frame before a transition, frames where multiple subsystems update simultaneously. Reproducing these reliably is part of what makes the bug class hard to test.

When writing a regression test for this fix, focus on the boundary conditions that surfaced the original bug. Tests that exercise the happy path catch obvious regressions; tests that exercise the boundary catch the subtler regressions that look like new bugs but are really the original returning. The latter are the tests that earn their keep over the long life of the project.

Team communication

When this bug class affects multiple teams (often the case for cross-system issues), early communication prevents duplicate work. The team that owns the symptom may not own the cause. A 15-minute conversation at the start of triage often saves hours of independent investigation.

If this fix touches a system several engineers work in, a short writeup in the team's engineering channel helps. Not a full design doc - a paragraph explaining what was wrong, what's fixed, and what to watch for. Future engineers encountering similar symptoms will search for the fix; making it findable is a small investment that pays back later.

“Pixel-perfect collision isn’t expensive — running it on every pair is. Filter cheaply, refine expensively.”

Related Issues

For broader Pygame performance, see Pygame performance tips. For simpler rect collision bugs, see Pygame sprite collision not detected between groups.

Print the cache hit rate for masks during dev. If you’re building new masks every frame, you’re shipping a 15 FPS game.