Quick answer: Measuring stability means turning a vague sense of reliability into tracked numbers. The core metric is your crash-free session rate, supported by crash and error frequency, time between failures, and the share of sessions affected by your top issues. Slice each by platform and build version so regressions stand out, track them over releases, and judge progress by your own trend rather than by a single absolute number borrowed from someone else's game.
Ask most indie developers how stable their game is and you will get a feeling, not a figure. It seems fine, or it has been crashing a lot lately. Feelings are a poor basis for decisions, especially when stability quietly drives retention and reviews. Measuring it properly means defining a handful of concrete metrics, tracking them across builds and platforms, and watching the trend so you can tell whether your last patch helped or hurt. This guide lays out which stability metrics matter for a game, how to slice them so they are diagnostic, and how to read them without fooling yourself.
Crash-free session rate is the headline number
If you measure one thing, measure the share of play sessions that end cleanly rather than in a crash. The crash-free session rate is the headline stability metric because it maps directly onto player experience: it is the probability that a given session does not blow up in someone's face. It is intuitive, it trends meaningfully over time, and it gives the whole team a single number to rally around. Most other stability metrics are ways of explaining movements in this one.
Define a session clearly so the metric is consistent, then track the rate continuously. The absolute value matters less than the direction. A rate that is climbing release over release tells you your stability work is paying off, while one that dips after a patch is an early warning to investigate before players feel it. Treat this number as a vital sign you watch every release, the way you watch downloads or revenue, and it will catch reliability problems long before they show up as churn or angry reviews.
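As a minimal sketch of the calculation, assume each session is recorded with a flag for whether it ended in a crash. The `Session` type and its field names are hypothetical, chosen only for illustration; your own definition of a session decides what counts as a record.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Session:
    # Hypothetical record shape: one entry per play session.
    session_id: str
    crashed: bool


def crash_free_rate(sessions: Iterable[Session]) -> Optional[float]:
    """Return the fraction of sessions that ended without a crash.

    Returns None when there are no sessions, because a rate over
    zero sessions is undefined rather than perfect.
    """
    sessions = list(sessions)
    if not sessions:
        return None
    clean = sum(1 for s in sessions if not s.crashed)
    return clean / len(sessions)
```

With four sessions of which one crashed, this returns 0.75. Whatever session definition you choose, keep it fixed across releases so the numbers stay comparable.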
Frequency and time between failures
The crash-free rate tells you how often sessions fail, but two supporting metrics add texture. Crash and error frequency, meaning how many failures occur per unit of play time (per hour played, for example), helps you see whether problems are rare flukes or constant background noise. A game can have a tolerable crash-free rate while still subjecting active players to frequent minor errors that erode the experience, and frequency surfaces that where the session rate alone might hide it.
Time between failures is the same information from the player's perspective. How long, on average, can someone play before something goes wrong? A short interval means players hit problems within a typical session, which is far more damaging than the same number of failures spread thinly across long sessions. Tracking the interval keeps you honest about lived experience rather than aggregate counts. Together, frequency and time between failures explain the texture of your stability, telling you not just how often things break but how that breakage lands on a real player mid-session.
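Both numbers fall out of the same two inputs: a count of failures and the total play time they occurred in. The sketch below assumes you can total play time in seconds; the function names and the choice of an hour as the unit are illustrative, not a standard.

```python
from typing import Optional


def failures_per_hour(failure_count: int, play_seconds: float) -> Optional[float]:
    """Failures per hour of play time, or None if no play time was recorded."""
    if play_seconds <= 0:
        return None
    return failure_count / (play_seconds / 3600)


def mean_time_between_failures(failure_count: int, play_seconds: float) -> Optional[float]:
    """Average seconds of play between failures, or None if nothing failed."""
    if failure_count == 0:
        return None
    return play_seconds / failure_count
```

Six failures across two hours of total play works out to three failures per hour, or one failure every 1200 seconds on average. Comparing that interval with your typical session length shows whether most players will hit a problem before they quit.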
Share of sessions hit by your top issues
A powerful and underused metric is how much of your instability comes from a small number of issues. Often a handful of top crashes account for the large majority of all failures. Measuring the share of bad sessions attributable to your top few issues tells you how concentrated the problem is, and concentration is good news, because it means a small number of fixes can move your overall stability a lot. This metric turns an overwhelming crash list into a focused plan.
Tracking this share also keeps your effort honest. After you fix a top issue, it should drop out of the top of the list and the shares held by the remaining issues should shift, confirming your fix removed a real chunk of instability. If the headline crash-free rate barely moves after you fixed what you thought was the biggest problem, the metric is telling you the problem was more distributed than you assumed. This share-of-sessions view is what connects individual bug fixes to the headline number, so you can see the impact of your work rather than hoping it mattered.
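One way to compute the concentration, assuming each crashed session has already been assigned an issue identifier by whatever grouping your crash tooling does. The input shape and the `top_n` default are assumptions made for this sketch.

```python
from collections import Counter
from typing import Iterable, Optional


def top_issue_share(crashed_session_issue_ids: Iterable[str], top_n: int = 3) -> Optional[float]:
    """Fraction of crashed sessions attributable to the top_n most common issues.

    Takes one issue id per crashed session. Returns None when there
    are no crashed sessions to attribute.
    """
    counts = Counter(crashed_session_issue_ids)
    total = sum(counts.values())
    if total == 0:
        return None
    top = sum(count for _, count in counts.most_common(top_n))
    return top / total
```

If ten crashed sessions split six, two, one, and one across four issues, the top two issues account for 0.8 of all bad sessions. Recomputing the share after each fix shows whether the fix removed as much instability as you expected.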
Slice by platform and build so regressions show
Every stability metric lies a little when aggregated. A healthy overall crash-free rate can hide one platform falling apart or one build regressing badly, because the broad average smooths the disaster away. The fix is to slice every metric by platform and by build version. Per-platform numbers reveal that, say, one operating system or GPU family is having a far worse time than your average suggests, which is exactly the kind of issue that aggregate numbers bury until it becomes a flood of reviews.
Slicing by build version is what makes regressions visible. When you cut stability per release, a new build that quietly made things worse stands out immediately as a drop against the previous one, instead of being averaged into a population still dominated by the older, more stable build. This is the difference between catching a regression in days rather than weeks. Always look at the segments, because the segments are where the actionable truth lives, and the headline number is only useful once you have confirmed it is not hiding a localized catastrophe.
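A rough sketch of the slicing, assuming each session record is a dictionary carrying `platform`, `build`, and `crashed` keys. Those key names are hypothetical; substitute whatever fields your reports carry.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, Mapping


def crash_free_by_segment(
    sessions: Iterable[Mapping],
    key: Callable[[Mapping], Hashable],
) -> Dict[Hashable, float]:
    """Crash-free session rate per segment, where key() picks the segment."""
    totals = defaultdict(int)
    clean = defaultdict(int)
    for session in sessions:
        segment = key(session)
        totals[segment] += 1
        if not session["crashed"]:
            clean[segment] += 1
    return {segment: clean[segment] / totals[segment] for segment in totals}


# Made-up data: the new build has few sessions, so it barely dents the average.
sessions = (
    [{"platform": "windows", "build": "1.0", "crashed": False}] * 8
    + [{"platform": "linux", "build": "1.1", "crashed": False}]
    + [{"platform": "linux", "build": "1.1", "crashed": True}]
)
by_build = crash_free_by_segment(sessions, key=lambda s: s["build"])
by_platform_and_build = crash_free_by_segment(
    sessions, key=lambda s: (s["platform"], s["build"])
)
```

In this made-up data the overall rate is 0.9, yet `by_build` shows build 1.0 at 1.0 and build 1.1 at 0.5. Passing a tuple-returning `key` slices by platform and build at once. Small segments produce noisy rates, so check the session count behind each number before reacting to it.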
Setting it up with Bugnet
Bugnet supplies the raw material these metrics need. It captures crashes from inside your build with stack traces and the device and platform context that lets you slice cleanly, so per-platform and per-build stability is something you can actually compute rather than estimate. Because reports carry the build version and player attributes, cutting your numbers by segment is straightforward, and the regressions that aggregate figures hide become visible against the previous release. The data arrives already tagged with everything you need to measure stability honestly.
Occurrence grouping is what makes the share-of-sessions metric possible. By folding duplicate crash reports into one issue with a count, Bugnet shows you exactly how concentrated your instability is and which few issues dominate it. You can rank crashes by how many sessions they affect and watch the headline rate respond as you fix from the top. With crashes, counts, and context in one dashboard, stability stops being a feeling you describe and becomes a set of numbers you track and improve release over release.
Judge by trend, not a magic number
The biggest trap in measuring stability is fixating on an absolute target borrowed from somewhere else. The right number for your game depends on its genre, platforms, and audience, and comparing yourself to a figure from a different kind of game tells you little. What matters is your own trend. Is each release more stable than the last? Are your top issues shrinking as a share of failures? Progress against your own history is a far more useful judgment than a benchmark invented for someone else's title.
Build the habit of reviewing these metrics every release, ideally on a dashboard the whole team sees, and treat a regression as something to investigate before shipping further. Stability measured this way becomes a steering instrument rather than a postmortem tool. You catch problems while they are small, you confirm your fixes worked, and you accumulate a record of steady improvement. None of this requires a big team, only the discipline to turn a vague feeling into a few tracked numbers and to actually look at them.
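The per-release review can be reduced to a small check like the one below. It assumes you already have one crash-free rate per build, listed in release order. The `tolerance` parameter is an assumption added here so ordinary noise does not count as a regression, and its value is yours to pick.

```python
from typing import List, Sequence, Tuple


def release_deltas(rates_by_release: Sequence[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Change in crash-free rate for each build relative to the one before it."""
    pairs = zip(rates_by_release, rates_by_release[1:])
    return [(build, rate - prev_rate) for (_, prev_rate), (build, rate) in pairs]


def regressed_builds(
    rates_by_release: Sequence[Tuple[str, float]],
    tolerance: float = 0.0,
) -> List[str]:
    """Builds whose rate fell by more than `tolerance` against the previous build."""
    return [
        build
        for build, delta in release_deltas(rates_by_release)
        if delta < -tolerance
    ]


history = [("1.0", 0.98), ("1.1", 0.985), ("1.2", 0.97)]
flagged = regressed_builds(history, tolerance=0.005)
```

Here `flagged` contains only build 1.2, which dropped about 1.5 percentage points against 1.1. The comparison is against your own previous release, not an external benchmark, which matches the judgment the trend is meant to support.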
Turn stability from a feeling into tracked numbers, slice them by platform and build, and judge progress by your own trend rather than a borrowed target.