Quick answer: An error message that says 'Network: connection refused: matchmaker.us-east' tells the on-call who to escalate to. 'Something went wrong' tells them nothing.

The right error message is half the on-call response. Tag the system, the action, and the recoverable state.

Format: System: Action: State

Example: 'Matchmaker: Find: NoServerAvailable'. Three tokens. The on-call greps the runbook for the system; the action points them to the next step; the state tells them the user impact.
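
A minimal sketch of the three-token format in Python. The class name ErrorTag and its parse helper are made up for illustration; they are not from any library. The sketch only shows that the format is cheap to build and cheap to check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorTag:
    """One error in the System: Action: State format (hypothetical helper)."""
    system: str
    action: str
    state: str

    def __str__(self) -> str:
        return f"{self.system}: {self.action}: {self.state}"

    @classmethod
    def parse(cls, text: str) -> "ErrorTag":
        # Split on the first two colons only, so a state that itself
        # contains a colon is kept whole.
        parts = [p.strip() for p in text.split(":", 2)]
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected 'System: Action: State', got {text!r}")
        return cls(*parts)


tag = ErrorTag("Matchmaker", "Find", "NoServerAvailable")
print(tag)  # Matchmaker: Find: NoServerAvailable
```

Rejecting anything that does not split into three non-empty tokens is a deliberate choice here: 'Something went wrong' fails to parse, which is the point.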

Embed the recovery action

End the message with what to do next: 'Retry available in 30s' or 'Restart required'. Players can act on the message, and support starts with a ticket that is already half filled in.
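
One way to carry the recovery action alongside the error, sketched in Python. Recovery, with_recovery and ticket_prefill are hypothetical names, and the ticket fields are an assumption about what a support tool might want, not a real ticketing API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Recovery:
    """What the reader of the message can do next (hypothetical model)."""
    kind: str                            # "retry" or "restart"
    retry_after_s: Optional[int] = None  # only meaningful for "retry"

    def hint(self) -> str:
        if self.kind == "retry":
            if self.retry_after_s is None:
                return "Retry available"
            return f"Retry available in {self.retry_after_s}s"
        if self.kind == "restart":
            return "Restart required"
        raise ValueError(f"unknown recovery kind: {self.kind!r}")


def with_recovery(message: str, recovery: Recovery) -> str:
    """Append the recovery hint so the message ends with the next step."""
    return f"{message} {recovery.hint()}"


def ticket_prefill(code: str, recovery: Recovery) -> dict:
    """Fields a support ticket can start with, taken from the error itself."""
    return {"error_code": code, "suggested_action": recovery.hint()}


print(with_recovery("No server available.", Recovery("retry", 30)))
# No server available. Retry available in 30s
```

The same Recovery value feeds both the player-facing text and the ticket, so the two cannot drift apart.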

Don't leak internals to players

Players see 'Couldn't reach the server. Trying again...' Support sees 'Matchmaker: Connect: TimeoutExceeded'. Same event, different audience.
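
A sketch of the split in Python: one event, rendered once per audience. The function name render, the lookup table and the fallback text are assumptions made for this example; only the two messages for the timeout case come from the text above.

```python
# Player-facing text, keyed by the internal code. Only the first entry
# comes from the article; a real table would have one entry per code.
PLAYER_MESSAGES = {
    "Matchmaker: Connect: TimeoutExceeded":
        "Couldn't reach the server. Trying again...",
}

# Hypothetical fallback for codes that have no player text yet.
FALLBACK_PLAYER_MESSAGE = "Couldn't complete that request. Please try again."


def render(code: str, audience: str) -> str:
    """Return the message for one audience: 'player' or 'support'."""
    if audience == "support":
        # Support and on-call get the internal code unchanged.
        return code
    if audience == "player":
        # Players never see the internal code, even for unmapped errors.
        return PLAYER_MESSAGES.get(code, FALLBACK_PLAYER_MESSAGE)
    raise ValueError(f"unknown audience: {audience!r}")


event = "Matchmaker: Connect: TimeoutExceeded"
print(render(event, "player"))   # Couldn't reach the server. Trying again...
print(render(event, "support"))  # Matchmaker: Connect: TimeoutExceeded
```

The fallback matters: an unmapped code should degrade to a bland player message, never to the raw internal string.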

Catalogue error codes

One spreadsheet, four columns: error code, user message, support context, severity. Update it every time you add a new error. Half the value of the system is the catalogue.
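
If the spreadsheet is exported as CSV, a loader like the following can check it on the way in. This is a sketch under assumptions: the column names, the low/medium/high severity scale and the sample rows (including the support context text and the second user message) are invented for illustration.

```python
import csv
import io

REQUIRED_COLUMNS = ("code", "user_message", "support_context", "severity")
SEVERITIES = {"low", "medium", "high"}  # assumed scale, not from the article

# Illustrative rows only; a real catalogue would live in its own file.
CATALOGUE_CSV = """\
code,user_message,support_context,severity
Matchmaker: Connect: TimeoutExceeded,Couldn't reach the server. Trying again...,Matchmaker did not answer before the client timeout,high
Matchmaker: Find: NoServerAvailable,No servers are free right now. Retry available in 30s,No game server had capacity for the request,medium
"""


def load_catalogue(text: str) -> dict:
    """Parse catalogue CSV into {code: row}, rejecting incomplete rows."""
    reader = csv.DictReader(io.StringIO(text))
    catalogue = {}
    for row in reader:
        for column in REQUIRED_COLUMNS:
            if not (row.get(column) or "").strip():
                raise ValueError(f"line {reader.line_num}: missing {column}")
        if row["severity"] not in SEVERITIES:
            raise ValueError(f"line {reader.line_num}: bad severity {row['severity']!r}")
        if row["code"] in catalogue:
            raise ValueError(f"line {reader.line_num}: duplicate code {row['code']!r}")
        catalogue[row["code"]] = row
    return catalogue


catalogue = load_catalogue(CATALOGUE_CSV)
print(catalogue["Matchmaker: Connect: TimeoutExceeded"]["severity"])  # high
```

Failing loudly on a blank cell or a duplicate code keeps the catalogue trustworthy; a catalogue with holes is little better than none.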

“Errors are conversation starters. The message decides who you're talking to.”

Make error catalogue maintenance a release-blocker checkbox. A new error in production without a catalogue entry is a debt that gets harder to pay later.
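
The checkbox can be backed by a small script in the release pipeline. A sketch in Python, assuming you can already produce two lists: the codes the build can emit and the codes in the catalogue. How those lists are gathered is left out, and both function names are hypothetical.

```python
from typing import Iterable, List


def missing_catalogue_entries(emitted_codes: Iterable[str],
                              catalogue_codes: Iterable[str]) -> List[str]:
    """Codes the build can emit that have no catalogue entry, sorted."""
    return sorted(set(emitted_codes) - set(catalogue_codes))


def release_gate(emitted_codes: Iterable[str],
                 catalogue_codes: Iterable[str]) -> int:
    """Return a process exit status: 0 to allow the release, 1 to block it."""
    missing = missing_catalogue_entries(emitted_codes, catalogue_codes)
    for code in missing:
        print(f"not in catalogue: {code}")
    return 1 if missing else 0


emitted = [
    "Matchmaker: Connect: TimeoutExceeded",
    "Matchmaker: Find: NoServerAvailable",
]
catalogued = ["Matchmaker: Connect: TimeoutExceeded"]
status = release_gate(emitted, catalogued)
# prints: not in catalogue: Matchmaker: Find: NoServerAvailable
```

In a pipeline the returned status would be passed to sys.exit, so a missing entry fails the build instead of becoming a reminder that nobody reads.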