Quick answer: Call convert() or convert_alpha() on every loaded image, draw with pygame.sprite.RenderUpdates so only dirty rects are repainted, cache transformed surfaces instead of recomputing them, and switch to pygame-ce if you are not already using it. These four changes can triple the frame rate of a project that does none of them today.

Pygame gets a reputation for being slow. It is not — but it is unforgiving, and it punishes you for patterns that are fine in other engines. The good news is that Pygame performance is deterministic: the same three or four mistakes account for 90% of the slowdowns, and fixing them is usually mechanical. Here is the full list of optimizations that every Pygame indie should know before shipping.

The Biggest Win: Call convert() on Everything

When you load a PNG with pygame.image.load("sprite.png"), the resulting Surface is in the file's native pixel format, which is almost never the same as your display's format. Every time you blit that surface, Pygame has to convert it on the fly, pixel by pixel, which is staggeringly slow.

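Call .convert() (for opaque images) or .convert_alpha() (for images with transparency) immediately after loading. This bakes the surface into the display format once, so every subsequent blit skips the conversion: an opaque surface becomes a fast memory copy, and an alpha surface goes through Pygame's optimized blending path. Both methods need a display mode to match against, so call them after pygame.display.set_mode().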

import pygame

pygame.init()
screen = pygame.display.set_mode((1280, 720))

# SLOW: unconverted format, converted on every blit
# player = pygame.image.load("player.png")

# FAST: converted once at load time
player = pygame.image.load("player.png").convert_alpha()
background = pygame.image.load("bg.png").convert()

Do this for every image. The performance gain is not subtle — it can be 5–20x, especially on older hardware. If you only take away one thing from this article, take away this.

Cache Transformed Surfaces

Calling pygame.transform.rotate, pygame.transform.scale, or pygame.transform.flip every frame is a CPU-intensive operation. Pygame has to allocate a new surface, iterate every source pixel, and compute its destination. If you are rotating a sprite to face the player every frame, you are burning cycles recomputing the same rotation over and over.

Pre-rotate into a lookup table at startup:

class RotatableSprite(pygame.sprite.Sprite):
    _cache = {}  # class-level cache shared across instances
    STEP = 10  # degrees between cached rotations

    def __init__(self, image_path):
        super().__init__()
        # Load and pre-rotate only the first time this path is seen
        if image_path not in self._cache:
            original = pygame.image.load(image_path).convert_alpha()
            # 36 slots (every 10 degrees), each rotated from the original
            self._cache[image_path] = [
                pygame.transform.rotate(original, angle)
                for angle in range(0, 360, self.STEP)
            ]
        self.rotations = self._cache[image_path]
        self.original = self.rotations[0]
        self.image = self.rotations[0]
        self.rect = self.image.get_rect()

    def set_angle(self, degrees):
        # Round to the nearest slot; the final modulo wraps 355-360 to slot 0
        index = round((degrees % 360) / self.STEP) % len(self.rotations)
        self.image = self.rotations[index]
        # Rotated images differ in size, so re-center the rect
        self.rect = self.image.get_rect(center=self.rect.center)

Quantizing to 10-degree steps is imperceptible for most games and turns a per-frame operation into a lookup table access.
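The slot lookup can be checked on its own, without loading any images. This is a minimal sketch, assuming two hypothetical helpers (angle_to_slot and quantization_error are illustrative names, not Pygame functions) and rounding to the nearest slot so the drawn angle is never more than half a step away from the requested one:

```python
def angle_to_slot(degrees, step=10):
    """Map any angle to the index of the nearest pre-rotated image."""
    slots = 360 // step
    # Round to the nearest step, then wrap so angles just under 360 map to slot 0.
    return int(round((degrees % 360) / step)) % slots


def quantization_error(degrees, step=10):
    """Smallest angular distance between the requested and the cached angle."""
    cached = angle_to_slot(degrees, step) * step
    diff = abs((degrees % 360) - cached) % 360
    return min(diff, 360 - diff)
```

With a 10-degree step the error stays at or below 5 degrees. If that is visible in your game, a smaller step trades memory for smoothness.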

Dirty Rect Rendering

Pygame has no built-in render loop, but the one most tutorials teach is:

screen.fill((0, 0, 0))
all_sprites.draw(screen)
pygame.display.flip()

Every frame, you clear the entire screen and redraw everything. This is simple but wasteful: if only a few sprites moved, you are repainting millions of pixels that did not change. Dirty rect rendering only repaints the regions that changed.

# Use RenderUpdates group which tracks dirty rects
sprites = pygame.sprite.RenderUpdates()

def game_loop():
    clock = pygame.time.Clock()
    background = pygame.image.load("bg.png").convert()
    screen.blit(background, (0, 0))
    pygame.display.flip()

    running = True
    while running:
        # Handle events so the window stays responsive and can be closed
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                running = False
        # Erase sprites by restoring the background under them
        sprites.clear(screen, background)
        # Update sprite positions
        sprites.update()
        # Draw sprites and get dirty rects
        dirty_rects = sprites.draw(screen)
        # Only update the rects that changed
        pygame.display.update(dirty_rects)
        clock.tick(60)

This is dramatically faster for games where only a fraction of the screen changes per frame (turn-based games, puzzles, management sims). For fast action games where everything scrolls, stick with the full-flip approach because the dirty region is the whole screen anyway.
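A rough way to decide between the two approaches is to estimate how much of the screen the dirty rects cover. This is a sketch under stated assumptions: dirty_fraction is a hypothetical helper, rects are plain (x, y, width, height) tuples, and the sprite count and size are made-up illustration values.

```python
def dirty_fraction(dirty_rects, screen_size):
    """Upper bound on the share of the screen repainted this frame.

    Rects are (x, y, width, height) tuples. Overlapping rects are counted
    twice, so the result is capped at 1.0.
    """
    screen_w, screen_h = screen_size
    dirty_area = sum(w * h for (_x, _y, w, h) in dirty_rects)
    return min(1.0, dirty_area / (screen_w * screen_h))


# Illustration: 20 moving sprites of 64x64 pixels. A moved sprite can
# contribute both its old and its new position, so assume two rects each.
rects = [(i * 10, 0, 64, 64) for i in range(40)]
share = dirty_fraction(rects, (1280, 720))
print(f"about {share:.0%} of the screen is repainted")
```

When the estimate approaches 1.0, as it does when the whole scene scrolls, the bookkeeping buys nothing and a full flip is the simpler choice.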

Use Sprite Groups, Not Lists

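pygame.sprite.Group is not just a convenience wrapper. In Pygame 2 its draw() hands every sprite to Surface.blits() in a single call, which avoids the per-sprite Python call overhead of a manual for-loop, and kill() removes a sprite from every group that holds it. The collision helpers pygame.sprite.spritecollide and pygame.sprite.groupcollide also take groups directly. A plain Python list of sprites with a manual for-loop can be 2–3x slower for typical use cases, though the gap depends on sprite count, so measure it in your own game.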

When most sprites sit still between frames, pygame.sprite.LayeredDirty goes further by combining dirty rect rendering with layered drawing:

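sprites = pygame.sprite.LayeredDirty()

class Background(pygame.sprite.DirtySprite):
    def __init__(self):
        super().__init__()
        self.image = pygame.image.load("bg.png").convert()
        self.rect = self.image.get_rect()
        # dirty = 1 draws once; LayeredDirty then repaints only the parts
        # that overlap a dirty area. dirty = 2 would repaint the full
        # screen every frame and cancel the dirty rect savings.
        self.dirty = 1

class Player(pygame.sprite.DirtySprite):
    def __init__(self):
        super().__init__()
        self.image = pygame.image.load("player.png").convert_alpha()
        self.rect = self.image.get_rect()

# Background drawn first (layer 0), player on top (layer 10)
sprites.add(Background(), layer=0)
sprites.add(Player(), layer=10)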

Avoid Per-Pixel Python

Iterating over pixels in Python is a performance disaster. Even a single for x in range(width): for y in range(height): loop on a 320x240 image is visibly slow because the Python interpreter pays 100–200 ns per iteration.

Use pygame.surfarray to access pixels as a NumPy array, then operate on the array in bulk:

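import pygame
import numpy as np  # pygame.surfarray needs NumPy installed

# array3d copies RGB only, so a round trip through make_surface drops alpha.
# Editing a copy in place through pixels3d keeps the alpha channel intact.
tinted = sprite.copy()
pixels = pygame.surfarray.pixels3d(tinted)  # (width, height, 3) view; locks the surface
# Tint everything red
pixels[:, :, 1] = 0  # zero green
pixels[:, :, 2] = 0  # zero blue
del pixels  # release the lock before blitting tinted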

NumPy's vectorized operations run at C speed. This pattern is 100–1000x faster than a Python loop.
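The per-iteration cost of a Python loop varies with the machine and the interpreter version, so it is worth measuring rather than trusting a quoted figure. This is a minimal sketch that uses only the standard library; touch_every_pixel is a hypothetical stand-in that does no real pixel work, so actual per-pixel code will be slower than what it reports.

```python
import timeit

WIDTH, HEIGHT = 320, 240


def touch_every_pixel():
    """Visit each coordinate once and do almost nothing with it."""
    total = 0
    for x in range(WIDTH):
        for y in range(HEIGHT):
            total += 1  # stand-in for real per-pixel work
    return total


runs = 5
seconds = timeit.timeit(touch_every_pixel, number=runs) / runs
ns_per_iteration = seconds / (WIDTH * HEIGHT) * 1e9
print(f"{seconds * 1000:.2f} ms per pass, {ns_per_iteration:.0f} ns per iteration")
```

Compare the milliseconds per pass against the 16.7 ms budget of a 60 FPS frame to see how much of it an empty loop already consumes.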

Profile Before You Optimize

Every optimization above costs you something: code complexity, memory, or flexibility. Do not apply them blindly. Instead, profile to find your actual bottleneck first.

import cProfile
import pstats

profiler = cProfile.Profile()
profiler.enable()

run_game()  # placeholder: call your game's own entry point here

profiler.disable()
stats = pstats.Stats(profiler).sort_stats("cumulative")
stats.print_stats(20)

Run this for 10 seconds of gameplay and look at the top 20 functions. If the blit method of Surface objects dominates, you have a blit problem (convert surfaces). If transform.rotate dominates, cache your rotations. If collide_rect dominates, you have too many sprites in your collision group, so partition them by zone and only test sprites that share a zone.
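Partitioning by zone can be as simple as a uniform grid. This is a minimal sketch, assuming integer (x, y, width, height) boxes and a hypothetical cell size of 128 pixels; build_zones and candidate_pairs are illustrative names, not part of Pygame. It only narrows down which pairs are worth testing, so the real collision check still runs on each returned pair.

```python
from collections import defaultdict


def build_zones(boxes, cell_size=128):
    """Bucket (x, y, w, h) boxes by every grid cell they overlap."""
    zones = defaultdict(list)
    for index, (x, y, w, h) in enumerate(boxes):
        for cx in range(x // cell_size, (x + w - 1) // cell_size + 1):
            for cy in range(y // cell_size, (y + h - 1) // cell_size + 1):
                zones[(cx, cy)].append(index)
    return zones


def candidate_pairs(boxes, cell_size=128):
    """Index pairs that share at least one cell and so need a real collision test."""
    pairs = set()
    for members in build_zones(boxes, cell_size).values():
        for i, a in enumerate(members):
            for b in members[i + 1:]:
                pairs.add((a, b))
    return pairs


boxes = [(0, 0, 32, 32), (10, 10, 32, 32), (1000, 1000, 32, 32)]
pairs = candidate_pairs(boxes)  # only the first two boxes share a cell
```

A cell size close to the largest common sprite size keeps most sprites in one to four cells. Cells that are too small inflate the bookkeeping, and cells that are too large bring back the all-pairs cost.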

When to Escape Pygame

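Pygame is a 2D-on-CPU library. If you need thousands of dynamic particles, complex shaders, or scroll-heavy parallax at 144 FPS, you are fighting the library rather than using it. The easy escape hatch is pygame-ce's GPU rendering path: the pygame.Window class combined with the Renderer and Texture classes in pygame._sdl2.video, which draw through SDL's hardware-accelerated renderer. That module is still marked experimental, so expect the API to shift between releases. The harder escape hatch is switching to moderngl for GPU-accelerated 2D, or to Godot if you want proper tooling.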

"Pygame is fast enough for any 2D indie game if you write it the Pygame way. Call convert, cache transforms, use sprite groups, and you will never think about performance again."

Related Issues

For music-specific performance and looping issues, see "Pygame music not looping correctly". For collision system pitfalls that can tank frame rate, read "Pygame sprite collision not detected between groups".

convert_alpha. Every image. Every time. Set a hook to remind you.