Quick answer: Report crash-free session rate per release, focus on the top 10 crash signatures each week, separate regressions from long-standing issues, and track fix velocity so the team knows whether the pipeline is healthy. Set an explicit target (for example, 99.5 percent crash-free sessions) and defend it across releases.

Most game teams treat crashes as individual bugs to squash. This works until your game ships, at which point the pile of unfixed crashes grows faster than you can clear it, and you lose track of whether things are getting better or worse. A crash rate process treats crashes as a portfolio: a small number of high-impact signatures that you fix aggressively, plus a long tail that you triage by impact rather than count. Done right, this approach lets you hold a crash-free rate above 99.5 percent across a year of releases without burning the team out on intermittent single-user bugs.

The Two Numbers That Matter

Report crash rate two ways, not one.

The first is the crash-free sessions rate: of all sessions that started in the last seven days, what percentage ended without a crash? This is the player-facing number. A player cares about whether their own play session survived, so session-level crash-free rate is what correlates with reviews and retention.

The second is crashes per thousand hours: the number of crashes in the last seven days divided by total playtime over the same period, measured in thousands of hours. This is the engineering-facing number. It captures the cost of long play sessions (which the crash-free rate hides) and is more sensitive to slow regressions.

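SELECT
    release_version,
    COUNT(DISTINCT session_id) AS sessions,
    COUNT(DISTINCT CASE WHEN crashed THEN session_id END)
        AS crashed_sessions,
    -- Column aliases cannot be reused inside the same SELECT list,
    -- so the two counts are repeated here. 1.0 * forces decimal division.
    1.0 - 1.0 * COUNT(DISTINCT CASE WHEN crashed THEN session_id END)
        / COUNT(DISTINCT session_id)      AS crash_free_rate,
    -- NULLIF avoids a divide-by-zero when no playtime was recorded.
    SUM(crash_count) / NULLIF(SUM(play_seconds) / 3600.0 / 1000, 0)
        AS crashes_per_k_hour
FROM sessions
WHERE started_at > NOW() - INTERVAL 7 DAY
GROUP BY release_version;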

Show both numbers on the same dashboard with per-release breakdowns. If crash-free rate is flat but crashes per thousand hours is rising, you likely have a new crash concentrated in long sessions, such as late-game content that short sessions never reach.
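If the session data lives outside a SQL database, the same two numbers can be computed directly. The following is a minimal sketch, assuming each session record carries a crashed flag, a crash count, and playtime in seconds; the Session class and crash_metrics function are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass
class Session:
    crashed: bool        # did this session end in a crash?
    crash_count: int     # crashes recorded during the session
    play_seconds: float  # total playtime for the session


def crash_metrics(sessions):
    """Return (crash_free_rate, crashes_per_k_hour) for a list of sessions.

    Either value is None when there is not enough data to compute it.
    """
    total = len(sessions)
    if total == 0:
        return None, None

    crashed = sum(1 for s in sessions if s.crashed)
    crash_free_rate = 1.0 - crashed / total

    # Playtime expressed in thousands of hours.
    k_hours = sum(s.play_seconds for s in sessions) / 3600.0 / 1000.0
    if k_hours > 0:
        crashes_per_k_hour = sum(s.crash_count for s in sessions) / k_hours
    else:
        crashes_per_k_hour = None

    return crash_free_rate, crashes_per_k_hour


# Hypothetical week: 995 clean one-hour sessions and 5 that crashed once each.
week = [Session(False, 0, 3600.0)] * 995 + [Session(True, 1, 3600.0)] * 5
rate, per_k_hour = crash_metrics(week)
print(f"{rate:.3%} crash-free, {per_k_hour:.1f} crashes per 1,000 hours")
```

Returning None rather than zero for an empty window keeps a release with no data from looking like a perfect release on the dashboard.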

Fingerprint and Group

Crash signatures are more useful than individual crashes. Two players who crash with the same stack trace almost certainly have the same bug. Group crashes by a signature derived from the top 3–5 stack frames (normalized to remove line numbers and inlining artifacts). Most crash reporters do this for you; if yours doesn’t, do it in a cron job.

A good signature is stable across releases if the underlying bug is stable. A fix in the module at frame 2 should change the signature; a change in frame 7 that happens to be in unrelated refactored code should not. Tune your frame depth and symbol normalization accordingly.
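If your crash reporter does not group for you, a fingerprint can be sketched in a few lines. This example assumes frames arrive as strings such as "Renderer::Draw (renderer.cpp:412) [inlined]"; that frame format, the "[inlined]" marker, and the default depth of 4 are illustrative assumptions, not the output of any particular reporter.

```python
import hashlib
import re

# Patterns for details that change between builds without changing the bug.
_INLINED = re.compile(r"\s*\[inlined\]")      # assumed inlining marker
_OFFSET = re.compile(r"\+0x[0-9a-fA-F]+")     # instruction offsets
_LINE_NO = re.compile(r":\d+")                # source line numbers


def normalize_frame(frame):
    """Strip inlining markers, offsets, and line numbers from one frame."""
    frame = _INLINED.sub("", frame)
    frame = _OFFSET.sub("", frame)
    frame = _LINE_NO.sub("", frame)
    return frame.strip()


def fingerprint(frames, depth=4):
    """Hash the top `depth` normalized frames into a short signature."""
    top = [normalize_frame(f) for f in frames[:depth]]
    digest = hashlib.sha1("\n".join(top).encode("utf-8")).hexdigest()
    return digest[:16]
```

With this scheme, a trace whose line numbers shift between builds keeps its fingerprint, and so does one that differs only below the chosen depth. Raising depth splits signatures more finely; lowering it merges more crashes into each group.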

Focus the Top 10

Every week, list the top 10 crash signatures sorted by unique affected sessions over the last 7 days. This is the entire work queue. Anything ranked below the top 10 goes into a “long tail” bucket that you check monthly but do not actively chase.

Why only 10? Because in most games the top 10 signatures account for 70–80 percent of all crashed sessions. Fixing a signature that affects 5,000 sessions improves the player experience more than fixing twenty signatures that each affect 50. Work on the big ones first, always.
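It is worth checking whether that concentration holds for your own game before committing to a cutoff of 10. The sketch below assumes you already have a mapping from fingerprint to affected-session count for every signature in the window; top_n_coverage is a hypothetical helper name. Because a session can crash under more than one signature, the result is an approximation of the share of crashed sessions.

```python
def top_n_coverage(affected_by_signature, n=10):
    """Return the fraction of affected sessions covered by the top n signatures.

    affected_by_signature maps fingerprint -> affected session count.
    """
    counts = sorted(affected_by_signature.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum(counts[:n]) / total
```

If the top 10 cover much less than the range quoted above, the crash population is flatter than usual and a longer weekly list may be justified.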

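def top_crash_signatures(days=7, limit=10):
    """Return the most impactful crash signatures, ranked by affected sessions.

    `db` is assumed to be an application-provided handle whose query()
    accepts a SQL string and a tuple of parameters.
    """
    return db.query("""
        SELECT fingerprint,
               first_seen_version,
               top_frame,
               COUNT(DISTINCT session_id) AS affected,
               COUNT(DISTINCT user_id) AS users
        FROM crashes
        WHERE occurred_at > NOW() - INTERVAL %s DAY
        -- first_seen_version and top_frame are assumed constant per
        -- fingerprint; listing them keeps strict GROUP BY modes happy.
        GROUP BY fingerprint, first_seen_version, top_frame
        ORDER BY affected DESC
        LIMIT %s
    """, (days, limit))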

Separate Regressions From Long-Standing Bugs

A crash signature that first appeared in the current release is a regression and deserves immediate attention — it means something you changed this release broke something that used to work. A signature that has existed for six releases but is spiking now is a rate change, usually because the affected content got more play (maybe you highlighted it in a livestream) rather than because the code got worse.

Tag every signature with first_seen_version and last_fixed_version. Regressions get a “block release” label; long-standing spikes get an “investigate context” label. The workflows for each are different and the prioritization is different.
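The labeling rule can be automated as a first pass. This is a minimal sketch, assuming version strings can be compared for equality and that a "spike" means the affected-session count at least doubled week over week; the spike_ratio threshold and the triage_label name are assumptions chosen for illustration, not values from any standard.

```python
def triage_label(first_seen_version, current_version,
                 affected_this_week, affected_last_week,
                 spike_ratio=2.0):
    """Return a triage label for one signature, or None if no label applies."""
    # A signature first seen in the current release is a regression.
    if first_seen_version == current_version:
        return "block release"

    # Long-standing signature: label it only when its rate has jumped.
    if affected_last_week == 0:
        spiking = affected_this_week > 0
    else:
        spiking = affected_this_week / affected_last_week >= spike_ratio

    return "investigate context" if spiking else None
```

A long-standing signature that is not spiking gets no label here and stays in the normal weekly ranking.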

Track Fix Velocity

Fix velocity is the number of signatures that moved from open to fixed-and-verified per week. It is a leading indicator: when velocity drops, the crash rate tends to worsen about two releases later. Publish this number on the same dashboard as crash rate and have a team conversation whenever it drops for three weeks in a row.

Fix velocity also exposes whether the triage process works. A low velocity with a long backlog means signatures are going in but not out — usually because they need reproductions that nobody is getting. A high velocity with a growing backlog means the pipeline is healthy but the team is outnumbered and needs more engineers or more tooling.
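Both the weekly count and the three-week alert are simple to compute. The sketch below assumes you can export the date on which each signature was verified fixed; the function names are hypothetical. Note that three consecutive drops need four weekly data points.

```python
from collections import Counter
from datetime import date


def weekly_fix_velocity(verified_dates):
    """Count signatures verified fixed per ISO (year, week)."""
    return Counter(tuple(d.isocalendar()[:2]) for d in verified_dates)


def dropped_three_weeks(weekly_velocity):
    """True if velocity fell in each of the last three week-over-week steps.

    weekly_velocity is a list of weekly counts, oldest first.
    """
    if len(weekly_velocity) < 4:
        return False
    last = weekly_velocity[-4:]
    return all(later < earlier for earlier, later in zip(last, last[1:]))


# Hypothetical export: three fixes verified across two ISO weeks.
fixes = [date(2024, 1, 1), date(2024, 1, 3), date(2024, 1, 8)]
print(weekly_fix_velocity(fixes))
print(dropped_three_weeks([9, 7, 5, 4]))
```

Weeks with no verified fixes do not appear in the counter, so fill them in as zero before passing the series to dropped_three_weeks.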

Publish a Release Quality Bar

Define what “ready to ship” means in numbers. A typical bar:

When a release fails the bar, the decision is explicit: delay, or accept the debt. Either is fine — what’s bad is releasing without knowing you missed the bar and then being surprised by the review fallout.
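A bar expressed in numbers can be checked automatically before the ship decision. The sketch below is illustrative only: it assumes two criteria, the 99.5 percent crash-free target mentioned earlier and no current-release regressions among the top signatures. Your own bar may include other criteria, and the check_release_bar name and the dictionary keys are hypothetical.

```python
def check_release_bar(crash_free_rate, top_signatures, current_version,
                      min_crash_free=0.995):
    """Return a list of human-readable failures; an empty list means pass.

    top_signatures is a list of dicts with 'fingerprint' and
    'first_seen_version' keys (assumed shape).
    """
    failures = []

    if crash_free_rate < min_crash_free:
        failures.append(
            f"crash-free rate {crash_free_rate:.2%} is below "
            f"the {min_crash_free:.2%} target"
        )

    # Signatures first seen in this release count as regressions.
    regressions = [
        s["fingerprint"] for s in top_signatures
        if s["first_seen_version"] == current_version
    ]
    if regressions:
        failures.append("regressions in top signatures: " + ", ".join(regressions))

    return failures
```

Returning the list of failures rather than a single boolean keeps the delay-or-accept decision explicit, because whoever signs off sees exactly which part of the bar was missed.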

Retrospect After Every Release

A week after release, run a short retro: did the crash rate hold, which signatures were new, which were regressions, which were promptly fixed, and what process improvement would have caught the regressions earlier. Write it down and circulate it. Over several releases you build an institutional memory of what kinds of changes produce which kinds of crashes, and that memory is more valuable than any single fix.

“We started publishing crash-free rate and fix velocity to the whole team eighteen months ago. The first month was embarrassing — we had a backlog of 80 signatures and velocity of 3 per week. Eighteen months later we hold 99.7 percent crash-free rate across four releases a year, and the team actually competes to knock signatures out of the top 10.”

Related Issues

For the upload pipeline behind these numbers, see how to build a crash dump automatic upload system. For automated regression detection, read how to test game performance regression in CI.

Post your top 10 crash signatures on a shared dashboard this week. The first time someone from marketing or design asks about a number, you’ll know the dashboard is earning its keep.