Quick answer: Is drawing thousands of sprites with a per-sprite blit loop slowing Pygame to a crawl? Each blit call crosses the Python ↔ C boundary; batch them with surface.blits() so the whole frame is a single call.
A bullet-hell game with 2000 bullets per frame drops to 20 FPS. The render loop calls screen.blit(b.image, b.rect) per bullet.
Batch with blits()
screen.blits([(s.image, s.rect) for s in bullets])
Single API call for the whole batch. Pygame iterates internally in C; far less Python overhead.
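As a minimal, self-contained sketch of the batched call: the Bullet class, the image size, and the positions below are made up for illustration, and an off-screen Surface stands in for the display so the snippet runs without a window.

```python
import pygame

# Off-screen surface standing in for the display (assumption: headless run).
screen = pygame.Surface((640, 480))


class Bullet:
    """Hypothetical bullet: only the two attributes the render loop uses."""

    def __init__(self, image, x, y):
        self.image = image
        self.rect = image.get_rect(topleft=(x, y))


bullet_image = pygame.Surface((4, 4))
bullet_image.fill((255, 255, 0))

# 2000 bullets scattered over the surface with arbitrary positions.
bullets = [Bullet(bullet_image, (i * 7) % 640, (i * 13) % 480) for i in range(2000)]

# One call for the whole batch instead of 2000 separate screen.blit() calls.
# With the default doreturn, blits() returns one Rect per drawn item.
drawn_rects = screen.blits([(b.image, b.rect) for b in bullets])
```

In a real game, `screen` would be the surface returned by `pygame.display.set_mode()`, and the call would sit inside the frame loop.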
Special Blit Flags
Each element of the sequence can be (source, dest), (source, dest, area), or (source, dest, area, special_flags). The forms can be mixed freely within one call.
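The sketch below mixes the tuple forms in one call. The surfaces and colors are invented for the example; the fourth element is what pygame's documentation calls special_flags, and pygame.BLEND_ADD is used here as one such flag.

```python
import pygame

screen = pygame.Surface((64, 64))
screen.fill((100, 0, 0))

glow = pygame.Surface((8, 8))
glow.fill((50, 50, 0))

sheet = pygame.Surface((16, 8))  # stand-in for a sprite sheet
sheet.fill((0, 0, 255))

screen.blits([
    # (source, dest): plain copy
    (glow, (0, 0)),
    # (source, dest, area): copy only the left 8x8 cell of the sheet
    (sheet, (16, 0), pygame.Rect(0, 0, 8, 8)),
    # (source, dest, area, special_flags): additive blend of the full source
    (glow, (32, 0), glow.get_rect(), pygame.BLEND_ADD),
])
```

The additive entry adds the source channels to whatever is already on the target, which is why it lands on the red background as a brighter orange rather than replacing it.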
Pre-Compute Rects
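If sprite rects change each frame, update them in a tight loop first and build the (image, rect) sequence in one pass before calling blits(). The C loop then receives ready-made pairs, and no per-sprite Python work is interleaved with the drawing.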
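One way to structure a frame along these lines, as a sketch only: the parallel `rects` and `velocities` lists and the `render_frame` helper are hypothetical names, not part of pygame.

```python
import pygame

screen = pygame.Surface((320, 240))
image = pygame.Surface((4, 4))

# Hypothetical bullet state: one Rect per bullet plus a per-frame velocity.
rects = [image.get_rect(topleft=(i, 0)) for i in range(100)]
velocities = [(0, 2)] * 100


def render_frame():
    # Step 1: move every rect in place in one tight loop.
    for rect, (dx, dy) in zip(rects, velocities):
        rect.move_ip(dx, dy)
    # Step 2: hand the finished pairs to a single blits() call.
    # doreturn=False skips building the list of changed rects.
    screen.blits([(image, rect) for rect in rects], doreturn=False)


render_frame()
```

Passing `doreturn=False` is optional; it only matters if the returned rects are not needed, for example when the whole display is flipped every frame.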
Sprite Groups
In pygame 2, pygame.sprite.Group.draw() already calls blits() internally when the target surface supports it. Use groups for managed sprite collections and you get the batching for free.
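A minimal group-based version might look like the following. The Bullet sprite and its spacing are invented for the example; Group.draw() only requires that each sprite has `image` and `rect` attributes.

```python
import pygame


class Bullet(pygame.sprite.Sprite):
    """Hypothetical sprite with the image/rect attributes Group.draw() expects."""

    def __init__(self, x, y):
        super().__init__()
        self.image = pygame.Surface((4, 4))
        self.image.fill((255, 255, 255))
        self.rect = self.image.get_rect(topleft=(x, y))


# Off-screen surface standing in for the display.
screen = pygame.Surface((320, 240))

# 40 bullets in a row, 8 pixels apart.
bullets = pygame.sprite.Group([Bullet(x, 10) for x in range(0, 320, 8)])

# Draws every sprite's image at its rect in one call.
bullets.draw(screen)
```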
Verifying
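FPS should recover noticeably at high sprite counts. Per-frame Python time drops, and what remains is dominated by the pixel copying itself (Pygame surfaces are blitted on the CPU, so the work is memcpy-bound) rather than by call overhead.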
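To check the effect on your own machine, a rough timing harness along these lines can help. The frame count, sprite count, and helper names are arbitrary choices for the sketch, and the numbers will vary with hardware, surface formats, and pygame version.

```python
import time

import pygame

screen = pygame.Surface((640, 480))
image = pygame.Surface((4, 4))

# Pre-built (source, dest) pairs so both variants draw identical work.
pairs = [
    (image, image.get_rect(topleft=((i * 7) % 640, (i * 13) % 480)))
    for i in range(2000)
]


def time_loop(frames=20):
    """Average seconds per frame using one blit() call per sprite."""
    start = time.perf_counter()
    for _ in range(frames):
        for source, dest in pairs:
            screen.blit(source, dest)
    return (time.perf_counter() - start) / frames


def time_batch(frames=20):
    """Average seconds per frame using a single blits() call."""
    start = time.perf_counter()
    for _ in range(frames):
        screen.blits(pairs, doreturn=False)
    return (time.perf_counter() - start) / frames


loop_ms = time_loop() * 1000
batch_ms = time_batch() * 1000
print(f"blit loop: {loop_ms:.2f} ms/frame, blits(): {batch_ms:.2f} ms/frame")
```

Measuring both variants over identical pairs keeps the comparison fair; for a fuller picture, profile the real frame loop as well, since game logic may dominate once drawing is batched.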
“Many blits = many Python calls. Use blits() to batch.”
If you outgrow blits() too, look at moderngl — instanced draw calls beat any pure CPU blit at scale.