Quick answer: Build a one-page scorecard with five metrics — crash rate, median session length, day-1/day-7 retention, FPS p95, and open critical bug count. Set red/yellow/green thresholds based on historical data, review it weekly, and assign owners to every yellow or red item. The scorecard turns vague “game feels buggy” conversations into concrete, actionable numbers.
Every game studio has a gut feeling about their game’s health. The scorecard replaces gut feelings with data. It should fit on one screen, take under five minutes to read, and make it immediately obvious whether the game is getting better or worse. Here’s how to build one that your team will actually use.
The Five Core Metrics
Start with exactly five metrics. More than five and the scorecard becomes a dashboard — dashboards are browsed, scorecards are acted on. You can always add metrics later, but start lean.
1. Crash rate. Crashes per 1,000 sessions. This is your most important stability signal. Pull it from your crash reporter (Bugnet, Crashlytics, Sentry) and break it down by platform. A single aggregate number hides platform-specific problems — your Android crash rate might be three times your iOS rate, and the aggregate looks fine because iOS dominates your install base.
2. Median session length. Not average — median. Averages are skewed by AFK sessions and overnight idles. Median gives you the typical player’s engagement. If median session length drops after a patch, something in the new content is pushing players out earlier.
3. Day-1 and day-7 retention. What percentage of new players return the next day, and what percentage return after a week. Retention is a lagging indicator — it takes seven days to calculate D7 — but it is the most reliable signal of whether your game is actually fun and stable enough to keep players coming back.
4. FPS p95. The frame rate experienced by your worst-off five percent of players. Strictly speaking, this is the 5th percentile of FPS, which corresponds to the 95th percentile of frame time. If the median FPS is 60 but the bottom five percent sit at 22, you have a significant population hitting severe performance issues that the median completely hides. Track this per platform.
5. Open critical bug count. The number of unresolved bugs marked “critical” or “blocker” in your tracker. This is the one metric that is entirely within your team’s control. If the number is going up, you are creating problems faster than you are fixing them.
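Two of these metrics hinge on getting the statistics right: medians for session length and low-end percentiles for FPS. A minimal Python sketch with illustrative sample data (not real telemetry):

```python
import math
import statistics

def percentile(values, pct):
    """Nearest-rank percentile; pct in (0, 100]."""
    ordered = sorted(values)
    k = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[k - 1]

# Illustrative sample data: one overnight idle session of 240 minutes.
session_minutes = [3, 5, 8, 12, 30, 31, 33, 35, 240]
fps_samples = [60, 60, 59, 58, 60, 57, 61, 22, 60, 60]

mean_session = statistics.mean(session_minutes)      # dragged up by the idle
median_session = statistics.median(session_minutes)  # the typical player
worst_five_pct_fps = percentile(fps_samples, 5)      # 5th percentile of FPS
```

Here the single 240-minute idle pushes the mean above 44 minutes while the median stays at 30, which is exactly why the scorecard uses the median.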
Setting Thresholds
Each metric needs three thresholds: green (healthy), yellow (investigate), and red (act immediately). The exact numbers depend on your game, genre, and platform, but here are reasonable starting points for a mid-core PC/console title.
// scorecard_thresholds.json
{
  "crash_rate_per_1k": {
    "green": "< 5",
    "yellow": "5 - 20",
    "red": "> 20"
  },
  "median_session_minutes": {
    "green": "> 25",
    "yellow": "15 - 25",
    "red": "< 15"
  },
  "retention_d1_percent": {
    "green": "> 40",
    "yellow": "25 - 40",
    "red": "< 25"
  },
  "fps_p95": {
    "green": "> 30",
    "yellow": "20 - 30",
    "red": "< 20"
  },
  "open_critical_bugs": {
    "green": "< 5",
    "yellow": "5 - 15",
    "red": "> 15"
  }
}
Calibrate these against your own historical data. Pull three months of crash rate numbers and find the median — that’s roughly where your green/yellow boundary should be. The red threshold should be a number that has historically correlated with player complaints or review score drops. Don’t set thresholds aspirationally; set them so that yellow actually means “this is worse than normal” and red means “this is worse than we’ve ever been comfortable with.”
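The calibration step can be automated. A sketch, assuming you can export a few months of weekly crash-rate readings as a list of numbers; the three-times-median red boundary is a placeholder to replace with a value tied to actual player-complaint incidents once you have them:

```python
import statistics

def calibrate_thresholds(history, red_multiplier=3.0):
    """Derive green/yellow/red boundaries from historical weekly readings.

    history: list of crash-rate readings (crashes per 1k sessions).
    The green/yellow boundary is the historical median, so yellow
    literally means "worse than a normal week". The red boundary here
    is an assumed multiple of the median; replace it with the level
    that historically coincided with complaints or review drops.
    """
    baseline = statistics.median(history)
    red_line = baseline * red_multiplier
    return {
        "green": f"< {baseline:.1f}",
        "yellow": f"{baseline:.1f} - {red_line:.1f}",
        "red": f"> {red_line:.1f}",
    }

# Twelve weeks of illustrative crash rates:
weekly_rates = [4.1, 3.8, 5.2, 4.9, 6.0, 4.4, 5.1, 4.7, 5.5, 4.2, 4.8, 5.0]
thresholds = calibrate_thresholds(weekly_rates)
```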
Building the Scorecard View
The scorecard should be a single page — a static HTML file, a Notion page, a Google Sheet, whatever your team already looks at. The key constraint is that it fits on one screen without scrolling. Each metric gets a row: name, current value, trend arrow (up/down vs. last week), and the color indicator.
<!-- Minimal scorecard layout -->
<table class="scorecard">
  <thead>
    <tr>
      <th>Metric</th>
      <th>Current</th>
      <th>Prev Week</th>
      <th>Trend</th>
      <th>Status</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Crash Rate (per 1k)</td>
      <td>3.2</td>
      <td>4.1</td>
      <td>↓</td>
      <td class="green">GREEN</td>
    </tr>
    <!-- Repeat for other metrics -->
  </tbody>
</table>
If you use Bugnet, the game health dashboard already aggregates crash rate, bug count, and performance snapshots. You can pull the data via the API and generate the scorecard automatically as part of a weekly cron job.
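A sketch of what that weekly job could look like. The endpoint URL, response shape, and field names below are assumptions for illustration, not Bugnet's actual API; check your crash reporter's API documentation before wiring this up:

```python
import json
import urllib.request

# Hypothetical endpoint and response shape -- not a real Bugnet URL.
API_URL = "https://api.example.com/v1/game-health/summary"

def fetch_health_summary(token):
    """Pull this week's metric values from a (hypothetical) health API."""
    req = urllib.request.Request(
        API_URL, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def render_row(name, current, previous):
    """Render one scorecard row with a week-over-week trend arrow."""
    arrow = "↓" if current < previous else ("↑" if current > previous else "→")
    return (f"<tr><td>{name}</td><td>{current}</td>"
            f"<td>{previous}</td><td>{arrow}</td></tr>")

def build_scorecard(summary):
    """Turn {'metrics': [{'name', 'current', 'previous'}, ...]} into HTML."""
    rows = "\n".join(
        render_row(m["name"], m["current"], m["previous"])
        for m in summary["metrics"]
    )
    return f'<table class="scorecard">\n{rows}\n</table>'

# A weekly cron entry would call fetch_health_summary() with an API token,
# then write build_scorecard(summary) out to the shared scorecard page.
```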
The Weekly Review Meeting
Schedule a 20-minute meeting every Monday. The format is rigid: walk through the five metrics top to bottom, note any color changes from last week, and for every yellow or red item, assign an owner and a deadline. No discussions about feature work, no product roadmap — just health.
The meeting should produce exactly one artifact: an updated scorecard with action items. Post it in a shared channel so people who could not attend can see what changed. If every metric is green and trending stable, cancel the meeting for that week. No one needs to sit through twenty minutes of “everything is fine.”
“The scorecard meeting isn’t about celebrating green metrics. It’s about catching yellow before it turns red. If you only look when things are already on fire, the scorecard isn’t doing its job.”
Connecting Metrics to Action
A metric without a response plan is just a number. For each metric, define what happens at each threshold. Crash rate goes yellow? The on-call engineer pulls the top three crash signatures and files bugs. FPS p95 drops below 30? The performance engineer profiles the five slowest scenes and opens optimization tickets. Open critical bugs exceed fifteen? The team pauses feature work for a bug-fix sprint.
Write these response plans down and include them on the scorecard page itself. When a metric goes red at 2 AM, the person checking the scorecard should know exactly what to do without waiting for a meeting. This is especially important during launch windows when the team is most stressed and least likely to think clearly about process.
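One way to keep the response plans unambiguous is to store them next to the thresholds and evaluate both in the same pass. A sketch; the numbers are illustrative and the plan text is taken from the examples above:

```python
def status(value, green, red, higher_is_better=False):
    """Classify a metric value as green, yellow, or red.

    For most metrics (crash rate, bug count) lower is better; for
    FPS p95 and session length, pass higher_is_better=True.
    """
    if higher_is_better:
        value, green, red = -value, -green, -red
    if value < green:
        return "green"
    if value > red:
        return "red"
    return "yellow"

# Response plans keyed by (metric, status). Anything not listed here
# defaults to "no action required".
RESPONSE_PLANS = {
    ("crash_rate_per_1k", "yellow"):
        "On-call engineer pulls the top three crash signatures, files bugs.",
    ("fps_p95", "red"):
        "Performance engineer profiles the five slowest scenes.",
    ("open_critical_bugs", "red"):
        "Pause feature work for a bug-fix sprint.",
}

def action_for(metric, value, green, red, higher_is_better=False):
    s = status(value, green, red, higher_is_better)
    return s, RESPONSE_PLANS.get((metric, s), "No action required.")

# Crash rate of 12 per 1k against the thresholds from the JSON above:
state, plan = action_for("crash_rate_per_1k", 12, green=5, red=20)
```

Because the plan lives in the same structure as the thresholds, the scorecard generator can print the required action right next to any yellow or red cell.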
Evolving the Scorecard
After two months, you will know whether your five metrics are the right five. Common additions include: memory peak on console (critical for certification), load time p95 (affects first impressions), and matchmaking success rate for multiplayer games. Remove a metric before adding one — the scorecard should never exceed seven items. If a metric has been green for twelve consecutive weeks and has no plausible failure mode, consider replacing it with something more informative.
Version your threshold definitions. When you ship a major expansion that changes the game’s performance profile, update the thresholds to reflect the new baseline rather than letting the scorecard flash yellow permanently because the old thresholds no longer apply.
Related Issues
For tracking build size as an additional health metric, see How to Track and Reduce Game Download Size. If your crash rate metric is spiking during live events, check How to Handle Player Reports During Live Events.
Five metrics, one page, twenty minutes a week. That’s the entire system.