Why use an error rate instead of an error count?

Because a raw count is contaminated by scale: more errors might just mean more players. The error rate strips that out by dividing by usage, so a rising rate means errors are genuinely happening more often per unit of play (a real problem), while a flat rate with a higher count just means more players. The rate distinguishes 'more broken' from 'more busy.'

How do error rates help with monitoring?

Error rates are the natural basis for thresholds and alerts because they are scale-independent: you can set a meaningful threshold that holds whether you have few players or many, so a rate-based alert fires when reliability genuinely degrades rather than just when traffic rises. Watching the rate's trend and comparing it across versions also reveals regressions hidden in rising volume.

What Is an Error Rate in Games?

What is an error rate?

An error rate is the frequency of errors normalized against a base such as sessions, requests, or time; for example, errors per session. Normalizing by usage makes the figure comparable over time and across scale, so it measures how often things go wrong relative to activity rather than the raw count, which rises simply because more players generate more errors.

Quick answer: An error rate measures how often errors occur relative to a unit of usage, such as errors per session, per request, or per hour. Normalizing the raw error count by usage makes it comparable over time and across scale, so a rising error rate signals a real increase in how often things go wrong, independent of how much the game is being used.

Counting errors is useful, but a raw count can mislead: more errors might just mean more players. An error rate fixes this by normalizing, expressing errors relative to usage, so you measure how often things go wrong, not just how many times. Whether it is errors per session, failed requests per total requests, or crashes per hour, the error rate is a fundamental monitoring metric that tells you the true frequency of problems, independent of scale. Understanding error rates is key to reading your monitoring data correctly.

What an Error Rate Is

An error rate is a count of errors divided by a base unit of usage: errors per session, errors per request, errors per hour, crashes per thousand sessions. The normalization is the point. A raw error count rises and falls with how much the game is being played: twice as many players will, all else equal, produce twice as many errors, even if nothing got worse. Dividing by usage strips that out, leaving a measure of how frequently errors occur per unit of activity.
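The division described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the function name `error_rate` and the weekly figures are hypothetical, chosen only to show that doubling both errors and sessions leaves the rate unchanged.

```python
def error_rate(error_count: int, usage_count: int) -> float:
    """Return errors per unit of usage (for example, errors per session)."""
    if usage_count <= 0:
        # A rate is undefined without any usage to normalize against.
        raise ValueError("usage_count must be positive")
    return error_count / usage_count


# Hypothetical figures: the player base doubles and so does the error count.
last_week = error_rate(error_count=50, usage_count=1_000)
this_week = error_rate(error_count=100, usage_count=2_000)

# The raw count doubled, but the per-session rate is the same in both weeks.
print(last_week, this_week)
```

The same function works for any base (requests, hours, thousands of sessions) as long as errors and usage are counted over the same period.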

This makes the error rate comparable across different scales and times. An error rate of 'X errors per session' means the same thing whether you have a hundred sessions or a million, so you can track it as your player base grows and know whether things are actually getting better or worse, not just busier. The rate is the honest measure of error frequency; the raw count is contaminated by scale.

Why Error Rates Matter

Error rates matter because they reveal real changes in reliability that raw counts hide. A spike in total errors might just be a traffic spike; a spike in the error rate (errors per session) means errors are genuinely happening more often per unit of play, a real problem. Watching the rate rather than the count is how you distinguish 'more players' from 'more broken,' which is essential for not panicking over benign growth or, worse, missing a real regression hidden in rising volume.

Error rates are also the natural basis for monitoring thresholds and alerts. Because the rate is scale-independent, you can set a meaningful threshold ('alert if the error rate exceeds X') that holds whether you have few players or many. A rate-based alert fires when things genuinely get worse, not just when you get busier, making it a reliable signal. This is why error rate, not raw error count, is the standard metric for reliability monitoring.
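As a rough sketch of what a rate-based alert might look like, the Python below compares errors per session against a fixed threshold. The function name `should_alert`, the 5% threshold, and the traffic figures are all assumptions made for illustration; a real threshold would come from your own baseline.

```python
def should_alert(errors: int, sessions: int, max_rate: float) -> bool:
    """Return True when errors per session exceeds max_rate."""
    if sessions <= 0:
        # No usage in the window, so there is no rate to evaluate.
        return False
    return errors / sessions > max_rate


MAX_RATE = 0.05  # hypothetical threshold: 5 errors per 100 sessions

# Traffic spike: many errors in absolute terms, but the rate is only 0.02.
busy_day = should_alert(errors=200, sessions=10_000, max_rate=MAX_RATE)

# Regression: fewer errors in total, but the rate has climbed to 0.08.
bad_release = should_alert(errors=80, sessions=1_000, max_rate=MAX_RATE)

print(busy_day, bad_release)
```

A count-based alert set at, say, 100 errors would have fired on the busy day and stayed silent on the bad release, which is the opposite of what you want.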

Tracking Error Rates in Your Game

To track an error rate, you need to count both errors and the normalizing base (sessions, requests, time), then watch their ratio over time and across versions. The most valuable views are trend (is the rate rising or falling?) and comparison (does the new version have a higher error rate than the old one, which would indicate a regression?). A rising error rate, or a version with a worse rate, is your signal that reliability has degraded.
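The version comparison could be sketched as follows, assuming you can export per-version session and error counts from your own tooling. The record shape `(version, sessions, errors)`, the helper names, and the sample numbers are hypothetical.

```python
from collections import defaultdict


def rates_by_version(records):
    """Aggregate (version, sessions, errors) records into errors per session."""
    totals = defaultdict(lambda: [0, 0])  # version -> [sessions, errors]
    for version, sessions, errors in records:
        totals[version][0] += sessions
        totals[version][1] += errors
    # Skip versions with no sessions, since their rate is undefined.
    return {v: e / s for v, (s, e) in totals.items() if s > 0}


def is_regression(old_rate: float, new_rate: float, tolerance: float = 0.0) -> bool:
    """Return True when the new rate is worse than the old by more than tolerance."""
    return new_rate > old_rate + tolerance


# Hypothetical daily records for an established version and a new release.
records = [
    ("1.0", 4_000, 80),
    ("1.0", 6_000, 120),
    ("1.1", 500, 30),
]

rates = rates_by_version(records)
regressed = is_regression(rates["1.0"], rates["1.1"])
print(rates, regressed)
```

Here version 1.1 produces far fewer errors in total than 1.0, yet its per-session rate is three times higher, which is the kind of regression a raw count would hide.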

Bugnet's crash and error reporting captures errors and crashes, groups and counts them, and tags them by version. Combined with session data, that lets you express error frequency as a rate rather than a raw count. This keeps your monitoring honest: you can see whether a rise in errors reflects a real increase in how often things go wrong (a rising per-session rate) or just more players (flat rate, higher count), and whether a new release degraded reliability (higher error rate on the new version). Watching the error rate, not just the count, alongside grouped issues that show which specific errors are driving it, is what lets you respond to genuine reliability problems while staying calm about benign growth.

An error rate normalizes errors by usage (errors per session, for example) instead of reporting a raw count. It tells you if things are more broken, not just busier.