Quick answer: Handling leaderboard anomalies requires a layered approach: server-side validation to reject physically impossible scores, session log evidence to verify that a gameplay session actually occurred, statistical analysis to flag outliers, and a human review process for borderline cases. The goal is to catch cheating without punishing legitimate players who found a strategy you didn’t expect.
A corrupted leaderboard is one of the most visible ways a game can fail its competitive players. When a score that is clearly impossible sits at the top of the rankings, it signals to every legitimate player that the competition is broken and their effort is pointless. At the same time, over-aggressive anti-cheat systems that remove legitimate records create a different kind of damage: a developer that punishes its best players for being too good generates the same community resentment as one that ignores cheating. Getting this balance right requires more engineering than most indie developers budget for, but the fundamentals are accessible even for small teams.
The Core Challenge: Exceptional Performance Looks Like Cheating
The fundamental difficulty of leaderboard integrity is that the scores you most need to verify — the outliers at the top of the distribution — are also the scores that could belong to your best players. A completion time that is 40% faster than the second-place player might be a speedrunner who spent 200 hours optimizing their route. It might also be a memory editor that set the completion timer directly. These two cases can look identical from the outside.
This is not a hypothetical tension. Every competitive game with an active community has legitimate players who find strategies the developer never anticipated. Bunny hopping, frame-perfect input sequences, out-of-bounds shortcuts, and unintended collision exploits are part of the culture of competitive play. Some developers treat these as bugs to be fixed; others treat them as features that define the community. That decision is yours to make, but your leaderboard integrity system needs to be consistent with it. If you decide that a particular skip is a legitimate strategy, your validation system must not flag scores that use it.
The practical implication is that no automated detection system is sufficient on its own. Statistical outlier detection, session log analysis, and server-side validation are all inputs to a decision, not the decision itself. The final call on removing a score should always involve a human review, especially for scores that are near but not beyond the boundary of what is mechanically possible.
Server-Side Score Validation: The First Line of Defense
Client-side score validation — where the game client checks whether a score is valid before submitting it — is not useful for anti-cheat purposes. A cheating client can bypass any validation the client performs. Server-side validation means the server independently verifies the submitted score before it is written to the leaderboard.
The most basic form of server-side validation is range checking: does the submitted score fall within the range of values that are physically possible given the game’s rules? If your level has a minimum possible completion time of 90 seconds based on the map geometry and player movement speed, the server should reject any submission claiming a time below 90 seconds. This eliminates all trivial cheating (setting your score to 0, or to the maximum integer value) without requiring any heuristics.
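As a concrete illustration, here is a minimal range check in Python. The `LEVEL_BOUNDS` table, level name, and field names are hypothetical; the real minimum comes from your own map geometry and movement speed analysis.

```python
# Minimal sketch of server-side range checking. All names and numbers
# here are illustrative, not values from any particular game.

LEVEL_BOUNDS = {
    # level_id: (minimum possible time in seconds, maximum plausible time)
    "canyon-run": (90.0, 3600.0),
}

def validate_completion_time(level_id: str, time_seconds: float) -> bool:
    """Reject any time outside the physically possible range for the level."""
    bounds = LEVEL_BOUNDS.get(level_id)
    if bounds is None:
        return False  # unknown level: reject rather than trust the client
    min_time, max_time = bounds
    return min_time <= time_seconds <= max_time
```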
More robust validation involves continuous game state verification. During a session, the client sends cryptographically signed snapshots of the game state at regular intervals — player position, score, collected items, elapsed time. The server verifies that each snapshot is consistent with the previous one: the score increase is within the maximum possible for the elapsed time, the player position is reachable from the previous position without teleportation, and the state transitions follow game logic. A cheated session that jumps from 10 points to 1,000,000 points in one interval, or that claims the player completed an area that requires 30 seconds in 2 seconds, fails this verification automatically.
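A sketch of what per-snapshot verification might look like, assuming the client signs each snapshot with a per-session HMAC key issued by the server. The rate limits (`MAX_SCORE_PER_SECOND`, `MAX_SPEED`) and field names are illustrative assumptions, not values from any particular engine.

```python
import hashlib
import hmac
import json
import math

MAX_SCORE_PER_SECOND = 50   # assumed cap on legitimate scoring rate
MAX_SPEED = 12.0            # assumed max movement, world units per second

def verify_snapshot(prev: dict, curr: dict, session_key: bytes) -> bool:
    # 1. Check the signature so a tampered snapshot is rejected outright.
    payload = json.dumps(
        {k: curr[k] for k in ("t", "score", "x", "y")}, sort_keys=True
    ).encode()
    expected = hmac.new(session_key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, curr["sig"]):
        return False

    dt = curr["t"] - prev["t"]
    if dt <= 0:
        return False  # timestamps must move forward

    # 2. The score increase must be possible in the elapsed time.
    if curr["score"] - prev["score"] > MAX_SCORE_PER_SECOND * dt:
        return False

    # 3. The new position must be reachable without teleporting.
    dist = math.hypot(curr["x"] - prev["x"], curr["y"] - prev["y"])
    return dist <= MAX_SPEED * dt
```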
Implement session tokens that tie each score submission to a specific game session. The server issues a token when a session starts and only accepts score submissions that reference a valid, active token. This prevents offline score manipulation (editing save files or memory and submitting the result) because the submission requires a token the server issued for a live session.
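A minimal token flow might look like the following, with an in-memory dict standing in for whatever store you actually use (Redis, a sessions table). All names here are assumptions for the sketch.

```python
import secrets
import time

ACTIVE_SESSIONS: dict[str, float] = {}   # token -> session start time
SESSION_TTL = 2 * 60 * 60                # assumed two-hour session lifetime

def start_session() -> str:
    """Issue a token when a live session begins."""
    token = secrets.token_urlsafe(32)
    ACTIVE_SESSIONS[token] = time.time()
    return token

def accept_submission(token: str, claimed_elapsed: float) -> bool:
    """Only accept scores tied to a valid, active session token."""
    started = ACTIVE_SESSIONS.pop(token, None)   # tokens are single-use
    if started is None or time.time() - started > SESSION_TTL:
        return False
    # A run cannot have taken longer than the session has existed,
    # which blocks scores produced offline and submitted later.
    return claimed_elapsed <= time.time() - started
```

Note the final check: a claimed run time longer than the session has existed is itself a giveaway that the score was not produced during the live session.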
Logging the Game State That Produced a High Score
For scores that pass server-side validation but are still statistically anomalous, the most useful evidence is a complete log of the game state during the session. This is where Bugnet’s session logging fits naturally into a leaderboard integrity workflow.
Log the following data points throughout every competitive session (a minimal event schema sketch follows the list):
- Player position and movement at regular intervals (every 2 to 5 seconds)
- Score increments with timestamps and the action that caused each increment
- Significant game events: level start, checkpoint passes, item pickups, enemy defeats
- The level seed or configuration if your game uses procedural generation
- Input events for key actions (so you can verify that a frame-perfect input actually occurred at the frame claimed)
- Elapsed time at each checkpoint
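Here is what one of these log records might look like, assuming a simple dataclass schema. The field names and event kinds are illustrative for this sketch, not a Bugnet API.

```python
from dataclasses import dataclass, field

# Illustrative event record for a competitive session log.

@dataclass
class SessionEvent:
    session_id: str
    t: float      # seconds elapsed since level start
    kind: str     # "position", "score", "checkpoint", "pickup", "input"
    data: dict = field(default_factory=dict)

# Examples of the events listed above:
events = [
    SessionEvent("sess-91f2", 0.0, "checkpoint", {"name": "start"}),
    SessionEvent("sess-91f2", 2.0, "position", {"x": 14.2, "y": 3.0}),
    SessionEvent("sess-91f2", 2.4, "input", {"action": "jump", "frame": 144}),
    SessionEvent("sess-91f2", 5.1, "score", {"delta": 100, "cause": "ring"}),
]
```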
When a score is flagged for review, pull the session log from Bugnet. A legitimate speedrun will show a smooth progression through the level with position data that follows a coherent path, checkpoint times that add up to the total completion time, and input events that correspond to claimed mechanics. A cheated score often lacks a corresponding session in the logs entirely (because the game was not actually running for the duration implied by the score), or shows impossible state transitions like the player position jumping across the map without a teleport mechanic.
The absence of a session log is itself strong evidence. If a player submits a score claiming to have completed a 10-minute level in 8 minutes, and your Bugnet logs show no session from that player during the relevant time window, the score was not produced by a genuine playthrough. You can remove it with confidence.
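Building on the event shape sketched above, a first-pass consistency check over a pulled session log might verify exactly these properties: checkpoint splits that add up, and a position trail with no teleport-sized jumps. The speed limit and tolerance values are placeholders.

```python
import math

def session_supports_time(events: list[SessionEvent],
                          claimed_total: float,
                          max_speed: float = 12.0,
                          tolerance: float = 1.0) -> bool:
    checkpoints = [e for e in events if e.kind == "checkpoint"]
    positions = [e for e in events if e.kind == "position"]

    # Checkpoint splits must increase monotonically and end near the
    # claimed total completion time.
    times = [e.t for e in checkpoints]
    if not checkpoints or times != sorted(times):
        return False
    if abs(times[-1] - claimed_total) > tolerance:
        return False

    # Consecutive position samples must trace a path reachable at a
    # legal movement speed (no cross-map jumps without a teleport mechanic).
    for prev, curr in zip(positions, positions[1:]):
        dt = curr.t - prev.t
        dist = math.hypot(curr.data["x"] - prev.data["x"],
                          curr.data["y"] - prev.data["y"])
        if dt <= 0 or dist > max_speed * dt:
            return False
    return True
```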
Statistical Anomaly Detection: Finding Outliers Automatically
Manual review of every high score is not scalable. Statistical anomaly detection automates the first pass — flagging scores that are far enough from the expected distribution to warrant human review, without automatically removing them.
The simplest approach is standard deviation analysis. Calculate the mean and standard deviation of scores in a leaderboard segment (for example, the top 100 scores over the last 30 days). Flag any score that is more than 3 standard deviations above the mean for review. This is a rough heuristic and will generate false positives, but it is a useful triage tool that prioritizes your review queue without requiring manual monitoring of every submission.
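A minimal version of this triage pass, assuming a time-based leaderboard where the suspicious direction is "too fast":

```python
import statistics

def flag_outliers(times: list[float], threshold: float = 3.0) -> list[float]:
    """Flag times more than `threshold` deviations faster than the mean."""
    mean = statistics.mean(times)
    stdev = statistics.stdev(times)
    if stdev == 0:
        return []
    # Lower is better for completion times, so the suspicious direction
    # is below the mean; for point-based leaderboards, flip the comparison.
    return [t for t in times if (mean - t) / stdev > threshold]
```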
More sophisticated approaches consider score velocity: how quickly a player’s ranking improves over time. A player who goes from rank 5,000 to rank 1 in a single session, with no prior high-rank sessions in the logs, is more suspicious than a player whose ranking has improved gradually over weeks of play. Score velocity analysis flags the sudden outliers while leaving consistent improvers alone.
Segment your analysis by player history. A new account submitting a top-10 score on their first recorded session is a stronger anomaly signal than an account with 200 logged sessions submitting the same score. Weight your anomaly scores accordingly.
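One way to combine the last two heuristics into a single review-queue weight; the coefficients here are illustrative starting points, not tuned values.

```python
def anomaly_weight(old_rank: int, new_rank: int, logged_sessions: int) -> float:
    # Score velocity: how many ranks were jumped in a single session.
    velocity = max(0, old_rank - new_rank)
    # Player history: a thin session log amplifies suspicion.
    history_factor = 1.0 / (1.0 + logged_sessions / 50.0)
    return velocity * history_factor

# A fresh account jumping 4,999 ranks weighs far more than a veteran
# account making the same jump:
#   anomaly_weight(5000, 1, 0)   -> 4999.0
#   anomaly_weight(5000, 1, 200) -> ~999.8
```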
False Positives: Speedrunners and Unexpected Strategies
Every leaderboard integrity system generates false positives. The question is how you handle them. Removing a legitimate score from a competitive player without warning or explanation is a community trust failure. It tells every player watching that excellent performance is treated with suspicion, which discourages the exact engagement you want your leaderboard to generate.
Build your review process around the assumption that flagged scores are probably legitimate until the evidence says otherwise. When a score is flagged automatically, the first action is investigation, not removal. Pull the session log, check the server-side validation results, look at the player’s history. Does the score look like the natural progression of a player getting better? Is there a known strategy or route that would produce this time?
Consult your community. Competitive players often know each other, and a top-ranked time in a well-known game is likely already documented somewhere — a stream recording, a Discord post, a Reddit thread. Before removing a score, spend five minutes searching for external evidence that it was achieved legitimately. This is not required for clearly impossible scores, but for borderline cases it is worth the effort and signals to your community that you take fairness seriously.
When you identify a legitimate strategy that your validation system flagged incorrectly, fix the validation rule before more players are affected. Document the strategy in your game’s internal knowledge base so future reviewers know it is legitimate.
The Appeals Process for Score Removal
Whenever you remove a score, notify the player with an explanation of why and a clear process for appealing the decision. The appeal process should require the player to provide evidence: a recording of the run, a session identifier from their game client, or an explanation of the strategy they used.
Cross-reference any evidence they provide against your session logs in Bugnet. If they provide a session ID, look it up and verify that the session data is consistent with their claimed time and strategy. If the logs confirm a legitimate session that your automated system incorrectly flagged, restore the score, apologize for the error, and fix the detection rule.
Even if you cannot restore a score, a responsive appeals process demonstrates that you take integrity seriously in both directions — against cheaters and against false positives. Players who feel they were treated fairly even when they disagree with a decision are far less damaging to your community than players who feel dismissed.
“A leaderboard that removes one legitimate world record does more damage to competitive engagement than one that lets through a handful of cheated scores. Players accept imperfect systems. They do not accept unfair ones.”
Building Score Replay Systems for High-Stakes Leaderboards
For games with a serious competitive scene — where leaderboard position has real community significance, associated prizes, or is featured in your marketing — a score replay system is the gold standard for verification. A replay system records the complete sequence of inputs during a session and stores them server-side. The server can then re-simulate the session deterministically from those inputs to verify the resulting score.
This approach is more engineering-intensive but provides near-perfect verification for any score it covers. If the server replay produces the same score as the submission, the score is almost certainly legitimate. If it produces a different score, the submission data was tampered with.
The key requirement is deterministic simulation: the same inputs must always produce the same game state. Many games are not fully deterministic because of floating-point differences across hardware, random number generators, or physics engines with platform-dependent behavior. If your game uses any of these, you will need to address them before a replay system is viable. For games that are already deterministic (turn-based games, games with discrete physics, games with seeded randomness), replay verification is relatively straightforward to implement.
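The shape of the check, demonstrated against a deliberately trivial deterministic simulation; `GameState` here is a stand-in for your engine's own state type.

```python
from dataclasses import dataclass

@dataclass
class GameState:
    x: int = 0
    score: int = 0

    def apply(self, action: str) -> None:
        # A deliberately trivial, fully deterministic rule set.
        if action == "right":
            self.x += 1
        elif action == "collect":
            self.score += self.x  # score depends on prior state

def verify_replay(inputs: list[str], submitted_score: int) -> bool:
    state = GameState()
    for action in inputs:   # re-simulate every recorded input in order
        state.apply(action)
    # Identical inputs must reproduce the identical score; any mismatch
    # means the submission data was tampered with.
    return state.score == submitted_score

# verify_replay(["right", "right", "collect"], 2)  -> True
# verify_replay(["right", "right", "collect"], 50) -> False
```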
Even a partial replay system — one that records key events rather than every input frame — provides substantially better evidence than statistical analysis alone. For the top N scores in any leaderboard where top placement matters to your community, requiring a replay submission is a reasonable policy that protects both the integrity of the rankings and the fairness of your review process.
Your best players deserve a leaderboard they can trust. Building that trust is an engineering problem, not just a moderation problem — and the right logging infrastructure makes both sides of it manageable.