Quick answer: Define 12–30 high-value event types tied to specific questions, build a batched HTTP sender with offline persistence, anonymize player IDs, get opt-in consent where required, and verify the pipeline end-to-end before launch. Use the data to find bugs and inform design, not to spy on players.
Crash reports tell you when the game broke. Custom telemetry tells you what the player was doing when it broke, what they were doing for the hour before that, and what they did differently from the player on the next machine over who did not crash. Without telemetry you are debugging in the dark. With it, you have a flashlight.
What Telemetry Is For
Telemetry is structured event data sent from the game client to your backend where you can query it. Each event records that something happened, when, and with what context. Examples:
- level_started{level_id, attempt_number, time_since_session_start}
- level_completed{level_id, duration_seconds, deaths, score}
- level_failed{level_id, duration_seconds, cause, last_checkpoint}
- item_acquired{item_id, source, total_owned}
- tutorial_step{step_id, time_to_complete}
The events answer questions: which level has the highest fail rate, which item is most ignored, which tutorial step takes the longest. Each question is a possible bug or design issue.
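As a rough illustration of turning events into an answer, the sketch below computes a per-level fail rate from level_failed and level_completed events. In practice this would more likely be a query against your backend; the LevelEvent type and the FailRate helper are hypothetical names invented for this example, and the event rows are assumed to be already loaded in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified event row: only the event name and its level_id.
public sealed class LevelEvent
{
    public string Name { get; }
    public string LevelId { get; }

    public LevelEvent(string name, string levelId)
    {
        Name = name;
        LevelId = levelId;
    }
}

public static class FailRate
{
    // Fail rate per level = level_failed / (level_failed + level_completed).
    // Any other event type is ignored.
    public static Dictionary<string, double> ByLevel(IEnumerable<LevelEvent> events)
    {
        return events
            .Where(e => e.Name == "level_failed" || e.Name == "level_completed")
            .GroupBy(e => e.LevelId)
            .ToDictionary(
                g => g.Key,
                g => g.Count(e => e.Name == "level_failed") / (double)g.Count());
    }
}
```

Sorting the resulting dictionary by value in descending order surfaces the levels worth investigating first.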
Pick Your Events Carefully
The temptation when adding telemetry is to track everything. Resist it. Tracking everything creates noise, costs money, and produces a haystack you cannot search. Instead, work backward from questions.
For each event you consider, ask:
- What question does this event answer?
- If the answer is X, what would I do? If the answer is Y, what would I do differently?
- How often does this event need to fire to be useful?
If you cannot answer all three, do not track the event. A good telemetry system has 12–30 event types. A bad one has 200. The bad ones are full of player_moved and menu_clicked events that nobody ever queries.
Schema and Versioning
Each event needs a fixed schema: a name, a set of named properties, and types for each property. Document the schema in a versioned file that ships with the game.
{
  "name": "level_completed",
  "version": 2,
  "properties": {
    "level_id": "string",
    "duration_seconds": "number",
    "deaths": "number",
    "score": "number",
    "used_hint": "boolean"
  }
}
Bump the version when you add, remove, or change the type of a property. The backend can then handle multiple schema versions in flight, which always happens because old clients linger for months after a release.
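One way to keep events honest against the schema file is to validate them in debug builds before they are queued. The sketch below is a minimal version of that idea, assuming the three type names used in the schema above ("string", "number", "boolean"); EventSchema and SchemaValidator are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory form of one entry in the schema file.
public sealed class EventSchema
{
    public string Name { get; }
    public int Version { get; }
    public IReadOnlyDictionary<string, string> Properties { get; }

    public EventSchema(string name, int version, IReadOnlyDictionary<string, string> properties)
    {
        Name = name;
        Version = version;
        Properties = properties;
    }
}

public static class SchemaValidator
{
    // Returns a list of problems; an empty list means the event matches the schema.
    public static List<string> Validate(EventSchema schema, IDictionary<string, object> props)
    {
        var problems = new List<string>();

        // Every declared property must be present and have the declared type.
        foreach (var declared in schema.Properties)
        {
            if (!props.TryGetValue(declared.Key, out var value) || value == null)
                problems.Add($"missing property '{declared.Key}'");
            else if (!Matches(declared.Value, value))
                problems.Add($"property '{declared.Key}' should be a {declared.Value}");
        }

        // Properties that the schema does not declare are reported too.
        foreach (var key in props.Keys)
        {
            if (!schema.Properties.ContainsKey(key))
                problems.Add($"unexpected property '{key}'");
        }
        return problems;
    }

    private static bool Matches(string expectedType, object value)
    {
        switch (expectedType)
        {
            case "string": return value is string;
            case "boolean": return value is bool;
            case "number": return value is int || value is long || value is float || value is double;
            default: return false; // unknown type name in the schema file
        }
    }
}
```

Running this only in development builds keeps the cost out of shipped games while still catching a renamed or mistyped property before it reaches the backend.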
The Sender
Events flow from game code through a sender that batches and uploads them. The sender has a few requirements:
- Non-blocking. Calling Telemetry.Track() from gameplay code must return immediately. Network I/O happens on a background thread.
- Batched. Send N events per request (e.g. 50) or after T seconds (e.g. 30), whichever comes first. One request per event wastes both bandwidth and battery.
- Resilient. If the request fails, retry with exponential backoff. After a few failures, persist to disk and try again on the next session.
- Bounded. Cap the in-memory queue and the on-disk buffer. If they fill up, drop oldest events first — never let telemetry consume unbounded memory.
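using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using UnityEngine;

public class Event
{
    public string name, session_id, player_id, game_version;
    public Dictionary<string, object> props;
    public long timestamp;
}

public class TelemetrySender
{
    private const int BATCH_SIZE = 50;
    private const int MAX_QUEUED = 5000;
    private const float FLUSH_INTERVAL = 30f; // seconds
    private readonly ConcurrentQueue<Event> _queue = new();
    private readonly CancellationTokenSource _cts = new();
    private readonly Func<List<Event>, Task> _sendBatch; // HTTP upload, supplied by the caller
    private readonly string _sessionId = Guid.NewGuid().ToString();
    private readonly string _playerId;
    private readonly string _gameVersion = Application.version; // Unity API: read once, on the main thread

    public TelemetrySender(string anonymousPlayerId, Func<List<Event>, Task> sendBatch)
    {
        _playerId = anonymousPlayerId;
        _sendBatch = sendBatch;
        _ = Task.Run(FlushLoop); // all network I/O stays off the game thread
    }

    // Returns immediately: only enqueues, never touches the network.
    public void Track(string name, Dictionary<string, object> props)
    {
        if (_queue.Count >= MAX_QUEUED)
            _queue.TryDequeue(out _); // queue is full: drop the oldest event
        _queue.Enqueue(new Event {
            name = name, props = props,
            timestamp = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
            session_id = _sessionId, player_id = _playerId, game_version = _gameVersion
        });
    }

    public void Stop() => _cts.Cancel();

    private async Task FlushLoop()
    {
        var lastFlush = DateTime.UtcNow;
        while (!_cts.IsCancellationRequested)
        {
            // Flush when a full batch is waiting or the interval has elapsed, whichever comes first.
            bool intervalElapsed = (DateTime.UtcNow - lastFlush).TotalSeconds >= FLUSH_INTERVAL;
            if (_queue.Count < BATCH_SIZE && !intervalElapsed) { await Task.Delay(250); continue; }
            var batch = new List<Event>();
            while (batch.Count < BATCH_SIZE && _queue.TryDequeue(out var e))
                batch.Add(e);
            if (batch.Count > 0)
                try { await _sendBatch(batch); } catch (Exception) { /* retry and disk persistence omitted here */ }
            lastFlush = DateTime.UtcNow;
        }
    }
}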
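The Resilient and Bounded requirements can be sketched separately from the queue itself. The example below retries a failed upload with exponential backoff and, once it runs out of attempts, appends the batch to a capped file so a later session can pick it up. It assumes each event is already serialized to a single-line JSON string; ResilientUploader, the send delegate, and the spill file format are all hypothetical choices for this sketch, and a production version would likely add jitter to the delays.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public sealed class ResilientUploader
{
    private readonly Func<IReadOnlyList<string>, Task<bool>> _send; // assumed transport: true on success
    private readonly string _spillPath;
    private readonly int _maxAttempts;
    private readonly int _maxSpilledLines;
    private readonly TimeSpan _baseDelay;

    public ResilientUploader(Func<IReadOnlyList<string>, Task<bool>> send, string spillPath,
        int maxAttempts = 4, int maxSpilledLines = 10000, TimeSpan? baseDelay = null)
    {
        _send = send;
        _spillPath = spillPath;
        _maxAttempts = maxAttempts;
        _maxSpilledLines = maxSpilledLines;
        _baseDelay = baseDelay ?? TimeSpan.FromSeconds(1);
    }

    // Delay after failed attempt number `attempt` (0-based): base * 2^attempt, capped at 60 seconds.
    public static TimeSpan BackoffDelay(TimeSpan baseDelay, int attempt)
    {
        double seconds = baseDelay.TotalSeconds * Math.Pow(2, attempt);
        return TimeSpan.FromSeconds(Math.Min(seconds, 60));
    }

    // Returns true if the batch was uploaded, false if it was written to disk instead.
    public async Task<bool> SendOrSpillAsync(IReadOnlyList<string> jsonEvents, CancellationToken ct = default)
    {
        for (int attempt = 0; attempt < _maxAttempts; attempt++)
        {
            try
            {
                if (await _send(jsonEvents)) return true;
            }
            catch (Exception)
            {
                // A transport exception counts as one failed attempt.
            }

            if (attempt == _maxAttempts - 1) break; // no point waiting after the last attempt
            try { await Task.Delay(BackoffDelay(_baseDelay, attempt), ct); }
            catch (OperationCanceledException) { break; } // shutting down: persist right away
        }
        Spill(jsonEvents);
        return false;
    }

    // Appends the batch to the spill file, dropping the oldest lines beyond the cap.
    private void Spill(IReadOnlyList<string> jsonEvents)
    {
        var lines = File.Exists(_spillPath) ? File.ReadAllLines(_spillPath).ToList() : new List<string>();
        lines.AddRange(jsonEvents);
        if (lines.Count > _maxSpilledLines)
            lines.RemoveRange(0, lines.Count - _maxSpilledLines);
        File.WriteAllLines(_spillPath, lines);
    }

    // Call at the start of a session to re-queue whatever the previous session could not send.
    public List<string> DrainSpilled()
    {
        if (!File.Exists(_spillPath)) return new List<string>();
        var lines = File.ReadAllLines(_spillPath).ToList();
        File.Delete(_spillPath);
        return lines;
    }
}
```

Rewriting the whole file on every spill is simple but not cheap; it is acceptable here only because spilling is the rare failure path, not the normal one.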
Privacy First
Telemetry can become surveillance fast. Stay on the right side of the line.
- Anonymize player IDs. Generate a random UUID on first launch and store it locally. Never tie it to an email, account name, or device IMEI.
- Never log free-form text. Player names, chat messages, search queries, custom level titles — none of these belong in telemetry.
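- Get consent where required. The rules differ by jurisdiction: in the EU, GDPR and the ePrivacy rules generally call for opt-in consent before non-essential analytics, while California's CCPA is built around notice and the right to opt out. Add a clear toggle in your settings menu, and honor it before any event leaves the device.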
- Document what you collect. Your privacy policy should list every event type and what data it contains. Players who care will check.
- Make opt-out painless. A player who turns telemetry off should still be able to play the entire game. No locked-out features, no nagging dialogs.
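The first and third points above can be sketched in a few lines: a random ID that is created once and stored locally, and a consent flag that gameplay code checks before queuing anything. TelemetryPrivacy is a hypothetical class for illustration. The file path is passed in as an assumption (in Unity it would typically live under Application.persistentDataPath), and whether consent defaults to on or off depends on the jurisdiction, so it is left as a constructor parameter.

```csharp
using System;
using System.IO;

public sealed class TelemetryPrivacy
{
    private readonly string _idPath;

    // Whether the player currently allows telemetry. Wire this to the settings toggle.
    public bool ConsentGranted { get; set; }

    public TelemetryPrivacy(string idPath, bool consentByDefault = false)
    {
        _idPath = idPath;
        ConsentGranted = consentByDefault;
    }

    // Returns the stored random ID, creating it on first use.
    // The ID is a fresh GUID, not derived from any account or device identifier.
    public string GetOrCreateAnonymousId()
    {
        if (File.Exists(_idPath))
        {
            var existing = File.ReadAllText(_idPath).Trim();
            if (Guid.TryParse(existing, out _)) return existing;
        }
        var id = Guid.NewGuid().ToString();
        File.WriteAllText(_idPath, id);
        return id;
    }

    // Gate to check before an event is queued, so nothing leaves the device without consent.
    public bool MayTrack() => ConsentGranted;

    // Opting out also discards the stored ID, so opting back in later starts a new identity.
    public void RevokeConsent()
    {
        ConsentGranted = false;
        if (File.Exists(_idPath)) File.Delete(_idPath);
    }
}
```

Checking MayTrack() at the point where events are queued, rather than at upload time, also keeps opted-out players' events out of the in-memory and on-disk buffers.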
The Backend Side
Your backend ingests events into a queryable store. Options:
- Managed analytics services (Mixpanel, Amplitude, PostHog). Easiest to set up, fixed schema, costs scale with event volume.
- Data warehouse pattern (BigQuery, Redshift, Snowflake). Cheap at scale, full SQL, requires more setup.
- Self-hosted (ClickHouse, Druid). Cheapest at scale, most control, you operate the database.
For a small studio, start with a managed service and migrate when costs become uncomfortable. Do not over-engineer the backend on day one.
Find Bugs You Would Never Have Found
Once you have a few weeks of data, start asking questions. Some real examples from games shipped with telemetry:
- “Why does level 7 have a 60% fail rate?” Because the time limit is too tight for the new boss attack pattern added in the last patch.
- “Why do 30% of players never finish the tutorial?” Because the third tutorial step requires a button that does not exist on a Steam Deck’s default control scheme.
- “Why do players on Linux have 3x the crash rate of Windows players?” Because of a specific shader variant that is broken on Mesa drivers.
- “Why does session length drop after the first hour?” Because the auto-save fails silently for players with non-ASCII characters in their Windows username.
None of these would surface in a crash report or a QA pass. Telemetry made them visible.
“Telemetry is the second-best debugger after a profiler. The questions it answers are the questions you did not know to ask.”
Related Resources
For broader monitoring patterns, see how to monitor game stability after launch. For building dashboards from the data, see how to build a game health dashboard. For complementary crash data, see how to build a crash report deduplication system.
Start with the questions, not the events. Every event you track should exist because there is a specific question you want to answer with it.