Quick answer: Instrument your game to send session-start and session-end events, then compute crash-free session rate as the percentage of sessions that end cleanly. Track this metric per game version, compare it against the previous release, and alert your team when a new version regresses beyond a defined threshold.
A game that crashes loses players. Worse, a game that crashes more after an update loses trust. Players expect each patch to be more stable than the last, and when it isn’t, the negative reviews appear within hours. Tracking crash-free session rate across versions gives you a single number that answers the most important stability question: is this build better or worse than the last one? Without this metric, you are flying blind — relying on player reports that represent a fraction of actual crashes and provide no comparative data between releases.
Defining a Session
Before you can count crashes, you need a precise definition of what a session is. The simplest and most reliable approach is process-lifetime sessions: a session starts when the game process launches and ends when it shuts down. Your game sends a session_start event to your telemetry backend at launch and a session_end event during the clean shutdown sequence.
If a session has a start event but no corresponding end event within a reasonable timeout window — typically 24 hours — the session is classified as crashed. The game process terminated abnormally before it could send the shutdown event. This approach catches hard crashes, freezes that force the player to kill the process, and operating system terminations due to resource exhaustion.
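As a minimal sketch of that classification logic (Python; the event-store shapes here are assumptions, not a prescribed schema):

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(hours=24)

def classify_sessions(start_times, ended_ids, now):
    """Classify each started session as clean, crashed, or still pending.

    start_times: session_id -> datetime of the session_start event
    ended_ids: set of session_ids that sent a session_end event
    """
    status = {}
    for session_id, started_at in start_times.items():
        if session_id in ended_ids:
            status[session_id] = "clean"
        elif now - started_at > SESSION_TIMEOUT:
            status[session_id] = "crashed"  # no clean shutdown within the window
        else:
            status[session_id] = "pending"  # still inside the timeout window
    return status
```

The "pending" state matters: a session that started an hour ago with no end event is probably still being played, so it should not count as crashed until the timeout window has fully elapsed.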
Include a unique session ID, the game version string, the platform, and a timestamp in both events. The session ID ties together all telemetry and crash reports from that play session. The version string is what allows you to compare stability across releases. Use semantic versioning or build numbers — whatever scheme your team already uses — but be consistent.
```gdscript
# Session start - called in the game's initialization
func start_session():
    var session = {
        "session_id": generate_uuid(),
        "event": "session_start",
        "version": get_game_version(),  # e.g. "1.4.2"
        "platform": get_platform(),  # e.g. "windows", "linux", "macos"
        "timestamp": unix_time_ms()
    }
    send_telemetry(session)
    store_local("current_session", session)
```
```gdscript
# Session end - called during clean shutdown
func end_session():
    var session = load_local("current_session")
    if session == null:
        return  # no session was recorded; nothing to close out
    send_telemetry({
        "session_id": session.session_id,
        "event": "session_end",
        "duration_ms": unix_time_ms() - session.timestamp,
        "timestamp": unix_time_ms()
    })
```
Some teams prefer activity-based sessions, where a new session begins after 30 minutes of inactivity. This is common in mobile games where the process may stay alive in the background for hours. Choose the model that matches how your players actually play, but document it clearly so everyone on your team interprets the metric the same way.
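A sketch of the activity-based model (Python, assuming you already have a chronologically sorted list of activity timestamps per player):

```python
from datetime import datetime, timedelta

INACTIVITY_GAP = timedelta(minutes=30)

def split_into_sessions(event_times):
    """Group sorted activity timestamps into sessions, starting a new
    session whenever the gap between events exceeds 30 minutes."""
    sessions = []
    current = []
    for t in event_times:
        if current and t - current[-1] > INACTIVITY_GAP:
            sessions.append(current)  # gap too long: close the session
            current = []
        current.append(t)
    if current:
        sessions.append(current)
    return sessions
```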
Computing Crash-Free Session Rate
The formula is straightforward. Take the total number of sessions for a given version in a given time window. Subtract the number of sessions that were classified as crashed. Divide by the total and multiply by 100 to get a percentage.
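In code, the formula is a one-liner (a sketch; the counts would come from your telemetry store):

```python
def crash_free_rate(total_sessions, crashed_sessions):
    """Crash-free session rate as a percentage of all sessions."""
    if total_sessions == 0:
        return None  # no data yet for this version/window
    clean_sessions = total_sessions - crashed_sessions
    return clean_sessions / total_sessions * 100
```

For example, 50,000 sessions with 210 classified as crashed gives (50,000 - 210) / 50,000 x 100 = 99.58%.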
A healthy game targets a crash-free rate of 99.5% or higher. Below 99%, players are noticing. Below 98%, you have a serious stability problem. These thresholds vary by platform — mobile games tend to crash more due to memory pressure and OS-level kills, so 99% might be an acceptable mobile target while 99.7% is expected on PC.
Calculate the metric daily, but also compute it over rolling 7-day and 30-day windows. Daily rates are noisy because a single bad hour can swing the number. The 7-day rate smooths out daily variance while still responding quickly to regressions. The 30-day rate gives you the long-term stability trend.
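A rolling window is the same formula applied to the trailing days' totals. A sketch, assuming per-day counts stored as (total, crashed) tuples:

```python
def rolling_crash_free_rate(daily_counts, window_days):
    """Crash-free rate over the trailing `window_days` entries of
    daily (total_sessions, crashed_sessions) tuples."""
    recent = daily_counts[-window_days:]
    total = sum(t for t, _ in recent)
    crashed = sum(c for _, c in recent)
    if total == 0:
        return None
    return (total - crashed) / total * 100
```

Note that this weights each session equally rather than each day equally, so a high-traffic day influences the window more than a quiet one, which is usually what you want.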
Always segment by version. The overall crash-free rate across all versions is misleading because it blends the stability of your current release with legacy versions that may still have active players. What you need to know is whether version 1.4.2 is more stable than version 1.4.1, and the only way to answer that is to compute the rate for each version independently.
Detecting Regressions Across Versions
Manual monitoring does not scale. You need automated regression detection that compares each new version against the previous one and alerts your team when stability degrades. The simplest approach is a threshold-based alert: if the crash-free rate of the new version drops more than 0.5 percentage points below the previous version during the first 48 hours after release, fire an alert.
The 48-hour window matters because early adopters tend to exercise the game more aggressively than the average player, and because you want to catch problems before the majority of your player base updates. If version 1.4.1 had a 99.6% crash-free rate in its first 48 hours and version 1.4.2 drops to 98.9% in the same window, something went wrong in the new release.
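The threshold check itself is trivial, which is part of its appeal (a sketch, with the 0.5-point threshold as a parameter):

```python
def stability_regressed(prev_rate, new_rate, threshold_pp=0.5):
    """True when the new version's crash-free rate has dropped more than
    `threshold_pp` percentage points below the previous version's rate
    over the comparison window (e.g. the first 48 hours)."""
    return (prev_rate - new_rate) > threshold_pp
```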
More sophisticated regression detection uses statistical comparison rather than fixed thresholds. Compute a confidence interval for the previous version’s crash-free rate and check whether the new version falls outside it. This accounts for the natural variance in crash rates and avoids false positives when the previous version happened to have an unusually good first 48 hours.
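One way to implement this (a sketch using the Wilson score interval, a standard choice for confidence intervals on proportions; the session counts are illustrative):

```python
import math

def wilson_interval(clean, total, z=1.96):
    """Approximate 95% Wilson score interval for a crash-free proportion."""
    if total == 0:
        return (0.0, 1.0)
    p = clean / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    margin = (z / denom) * math.sqrt(
        p * (1 - p) / total + z * z / (4 * total * total)
    )
    return (center - margin, center + margin)

def is_statistical_regression(prev_clean, prev_total, new_clean, new_total):
    """Flag a regression when the new version's crash-free proportion
    falls below the previous version's confidence interval."""
    lower, _ = wilson_interval(prev_clean, prev_total)
    return (new_clean / new_total) < lower
```

With 50,000 sessions at 99.6% crash-free, the interval is tight, so even a drop to 98.9% in the new version is flagged; with only a few hundred sessions, the interval widens and the same drop might not be, which is exactly the false-positive protection you want early in a rollout.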
Segment regressions by platform. A crash that only affects Linux players will barely move the overall rate if 90% of your players are on Windows. Platform-specific alerts ensure you catch regressions that affect a minority of your player base but are severe within that group.
Dashboard Design for Version Comparison
A crash tracking dashboard should answer five questions at a glance: what is the current crash-free rate, is it better or worse than the last version, what are the top crash signatures, which platforms are affected, and how many players are impacted.
```sql
-- Dashboard data query - compare current vs previous version
SELECT
    version,
    COUNT(*) AS total_sessions,
    SUM(CASE WHEN crashed = 0 THEN 1 ELSE 0 END) AS clean_sessions,
    ROUND(
        SUM(CASE WHEN crashed = 0 THEN 1 ELSE 0 END) * 100.0 / COUNT(*), 2
    ) AS crash_free_rate
FROM sessions
WHERE version IN ('1.4.1', '1.4.2')
  AND created_at >= NOW() - INTERVAL 48 HOUR
GROUP BY version
```
Place the crash-free rate front and center as a large number with a delta indicator showing the change from the previous version. Green arrow up means improvement, red arrow down means regression. Below it, show a time-series chart of the daily rate for the current and previous versions on the same axes so the team can see the trend at a glance.
List the top five crash signatures for the current version, ranked by occurrence count. Each signature should link to the grouped crash report so a developer can click through and start investigating immediately. Show the platform breakdown as a simple bar chart or table. If one platform’s crash-free rate is significantly lower than the others, it should be visually obvious.
Connecting Crashes to Bug Reports
Crash telemetry is most valuable when it connects to your bug tracking workflow. When a crash signature appears for the first time in a new version, automatically create a bug report with the crash details, affected version, platform, session count, and a link to the first occurrence. When the crash rate for a known signature increases after a release, reopen the associated bug and escalate its priority.
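The routing decision can be kept separate from any particular bug tracker. A sketch of that triage logic (the "create"/"reopen"/"ignore" actions and the signature store are assumptions; wire them to your tracker's API):

```python
def triage_crash(signature, version, known):
    """Decide the bug-tracking action for an incoming crash signature.

    known: signature -> set of versions it has already been seen in.
    Returns "create" (first sighting anywhere), "reopen" (known crash
    resurfacing in a new version), or "ignore" (already tracked)."""
    seen_versions = known.setdefault(signature, set())
    if not seen_versions:
        seen_versions.add(version)
        return "create"
    if version not in seen_versions:
        seen_versions.add(version)
        return "reopen"
    return "ignore"
```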
Include the session ID in crash reports so developers can pull the full telemetry timeline for a crashed session: what scenes the player visited, what actions they took, and what errors were logged before the crash. In Bugnet, session metadata attaches automatically to crash reports, giving developers the complete context without asking the player for additional information.
“Before we tracked crash-free rate by version, we shipped a patch that introduced a rare crash on Linux. It took two weeks for the first player report to come in. Now our dashboard flags regressions within 24 hours of release, and we can hotfix before most players even update.”
Related Resources
For setting up the crash reporting infrastructure, see automated crash reporting for indie games. To learn about grouping and deduplicating crash reports, read stack trace grouping and crash deduplication. For metrics beyond crash rate, check out bug reporting metrics every game studio should track.
Add session start and end events to your game this week. Within 48 hours of your next release, you will know exactly whether the build improved or regressed stability.