Quick answer: A telemetry data dictionary is a YAML file in your repo listing every event you emit, with schema, purpose, owner, and an example payload. Validate it in CI. Version events so renames don’t break downstream dashboards. The dictionary is the only source of truth — code must conform to it, not the other way around.

The retention chart on your dashboard has been flat for three weeks. An analyst digs in and discovers the level_complete event now fires when the player enters the level, not when they finish it. No one remembers who made the change. No one wrote it down. The dashboard is lying, and has been since the 1.4.0 patch — because the team had no contract describing what level_complete means.

What a Data Dictionary Solves

A data dictionary is the contract between the code that emits telemetry and the humans and systems that consume it. Without one, the schema drifts quietly. A developer rewrites an event and renames playerId to user_id. A gameplay programmer adds a property and forgets to document it. Analytics rebuilds a dashboard on the new property, which disappears the next release because it was never officially supported.

The dictionary answers four questions for every event: What is this? When does it fire? Who owns it? What does a real payload look like? Those answers prevent the decay of your analytics pipeline from a trusted source of truth into a lossy folklore system.

Pick a Format That Lives in the Repo

Plain-text, human-readable, diffable. YAML is the practical choice. JSON is fine if your tooling prefers it. Avoid spreadsheets — they don’t diff, they don’t version cleanly, and they drift out of sync with the code.

# telemetry/events/level_complete.yaml
name: level_complete
version: 2
owner: gameplay-team
description: |
  Fires the instant the player reaches the level's exit trigger.
  Does NOT fire on restart, retry, or developer skip.
trigger: ExitTriggerEnter handler in LevelController.cs
properties:
  level_id:
    type: string
    required: true
    description: Canonical slug, e.g. "forest_01"
  duration_seconds:
    type: float
    required: true
    description: Elapsed in-game time since level_start
  deaths:
    type: int
    required: true
example:
  level_id: "forest_01"
  duration_seconds: 183.4
  deaths: 2
changelog:
  - "v2 (2026-03-12): added deaths property"
  - "v1 (2025-11-01): initial version"

One file per event scales better than one master file once you pass 20 events. Use a top-level events/ directory and group by domain (gameplay/, store/, session/) as it grows.

Wire It Into CI

The dictionary is worthless if it drifts from the code. Add a CI job that parses every Track() call in your codebase and validates the event name and property set against the dictionary. Unknown events fail the build. Missing required properties fail the build. Extra properties trigger a warning.

Language-specific approaches: in C#, use a source generator that reads the dictionary and produces a strongly-typed Track.LevelComplete(string levelId, float duration, int deaths) method. Then the compiler enforces the contract, not a post-hoc linter. In scripted languages (GDScript, Lua), a unit test iterates every event call site and validates at test time.

Version, Don’t Rename

The worst telemetry change is renaming a field. Every downstream dashboard, every SQL query in the wiki, every warehouse job breaks silently. The discipline is: never rename a field. If the semantics change, add level_complete_v2 as a new event and emit both for a migration window. Drop the v1 event only when every consumer has migrated.

Same for breaking schema changes. Adding an optional property is fine. Making a property required, changing its type, or changing its units is a new event version. The changelog in the YAML file shows the history; the event name shows the current contract.

Consumer Contracts

Every event has at least one consumer: the retention dashboard, the economy team, a live-ops alert. List them in the YAML. When you’re about to change an event, the consumer list tells you who to notify. When a consumer breaks, the dictionary tells you whether their query was built on a supported field or a property they scraped.

consumers:
  - name: retention_dashboard
    owner: analytics-team
    uses_properties: [level_id, duration_seconds]
  - name: new_player_funnel_report
    owner: live-ops
    uses_properties: [level_id]

PII and Retention

Tag every property that might contain personally identifiable information. The dictionary is the right place to specify whether a property is stored in the long-term warehouse, redacted after 30 days, or never persisted at all. The data platform can read these tags and enforce retention automatically. Auditors can read them when you get asked “what personal data do you collect?” — and the answer is the dictionary, not a PDF written three years ago.

Keep It Readable

The dictionary is a reference document that designers, analysts, and QA will read more often than the engineers who write it. Put effort into the descriptions. Use real examples with plausible values. Link to the dashboards that depend on each event. A well-maintained dictionary becomes the single best onboarding document for new hires trying to understand your product.

“If an event isn’t in the dictionary, it doesn’t exist. Consumers must not build on undocumented fields, and producers must not emit undocumented events.”

Related Issues

For picking which events matter in the first place, see how to instrument games for useful analytics. For the reporting side, see the anatomy of a good bug report.

A data dictionary isn’t paperwork — it’s the difference between trusting your dashboards and rebuilding them every quarter.