Quick answer: Write crash dumps from an out-of-process handler (Crashpad or Breakpad), upload them on the next game launch using pre-signed URLs, retry with exponential backoff, deduplicate by stack-hash fingerprint, and apply rate limits both client- and server-side. Ask once for consent and always strip PII before upload.
Crash dumps are the single highest-value signal you can collect from shipped games. A minidump gives you a stack trace, loaded modules, register state, and usually enough context to reproduce a bug that a thousand players saw but nobody could describe. The catch: the code that uploads the dump runs in an environment that just proved itself unreliable, and it runs on the machines of players who did not sign up to be your telemetry endpoint. A good upload system gets data reliably, respects player resources, and fails silently when the network is not available.
Never Upload From the Crashing Process
The first rule is that a crashing process cannot be trusted to run network code. The stack may be corrupted, memory may be invalid, and the threading state is in whatever shape the crash left it. Google’s Crashpad is built around this problem, and its predecessor Breakpad supports an out-of-process mode for the same reason: a small handler process stays running alongside the game. When the game crashes, the handler writes a minidump to disk from outside the crashed process. Your game never calls into networking code during the crash.
The uploader runs on the next game launch. It scans a known crash directory, finds any unsent dumps, and uploads them before the player sees the main menu. If upload fails, the dump stays on disk and the uploader tries again next time.
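#include "client/crashpad_client.h"

// GetCrashDumpDirectory, GetHandlerExePath, and GetAnnotations are the game's own helpers.
bool InitCrashHandler() {
    base::FilePath dumps = GetCrashDumpDirectory();
    crashpad::CrashpadClient client;
    // StartHandler returns false if the handler could not be started.
    return client.StartHandler(
        GetHandlerExePath(),
        dumps,                      // crash database directory
        base::FilePath(),           // metrics_dir (empty; FilePath has no implicit conversion from "")
        "https://example.invalid",  // unused, we upload manually
        GetAnnotations(),
        { "--no-upload-gzip" },
        true,                       // restartable
        false                       // asynchronous_start
    );
}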
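We bypass Crashpad’s built-in uploader by pointing it at https://example.invalid, a reserved domain that never resolves, because we want full control over batching, signed URLs, and rate limiting. Leave uploads disabled in the Crashpad database settings as well, so the handler never attempts one.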
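The launch-time scan described above can be sketched in a few lines. This is a minimal illustration, not Crashpad's own API: the function name `find_pending_dumps`, the `.dmp` extension, and the recursive search are assumptions about how the crash directory is laid out.

```python
from pathlib import Path
from typing import List


def find_pending_dumps(crash_dir: Path) -> List[Path]:
    """Return unsent minidumps, oldest first; a missing directory means nothing to send."""
    if not crash_dir.is_dir():
        return []
    # Assumption: dumps carry a .dmp extension and may sit in subdirectories.
    dumps = [p for p in crash_dir.rglob("*.dmp") if p.is_file()]
    return sorted(dumps, key=lambda p: p.stat().st_mtime)
```

Sorting oldest first means a dump that is about to age out gets its upload attempt before newer ones.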
Use Pre-Signed Upload URLs
Do not upload directly to your API server. Minidumps are large (1–50 MB), and streaming them through your application servers wastes CPU and bandwidth. Instead, ask your backend for a pre-signed S3-compatible URL, then upload straight to object storage.
The backend issues a URL that is valid for five minutes, scoped to a specific object key, and signed with a server-only credential. The client gets the URL, uploads, and posts the resulting object key back so the backend can enqueue symbolication.
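import aiohttp

API_BASE = "https://api.example.invalid"  # placeholder: your backend's origin

async def upload_dump(dump_path, meta):
    # raise_for_status=True turns any 4xx/5xx response into an exception.
    async with aiohttp.ClientSession(raise_for_status=True) as s:
        # Ask the backend for a signed URL and the object key it covers.
        async with s.post(f"{API_BASE}/api/crash/signed-url", json=meta) as signed:
            info = await signed.json()  # parse the body once
        # Stream the dump straight to object storage.
        with open(dump_path, "rb") as f:
            async with s.put(info["url"], data=f,
                             headers={"Content-Type": "application/octet-stream"}):
                pass
        # Report the key back so the backend can enqueue symbolication.
        async with s.post(f"{API_BASE}/api/crash/confirm",
                          json={"key": info["key"], **meta}):
            pass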
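On the backend side, the three properties of the URL (short-lived, scoped to one object key, signed with a server-only credential) can be illustrated with a plain HMAC. This is a teaching sketch, not the signing scheme of S3 or any real object store, whose SDKs generate pre-signed URLs for you. `SECRET`, `sign_upload_url`, and `verify_upload` are hypothetical names.

```python
import hashlib
import hmac
from urllib.parse import urlencode

SECRET = b"server-only-secret"  # hypothetical; never shipped in the client


def _signature(key: str, expires: int) -> str:
    # Binding the method, object key, and expiry means none of them can be altered.
    msg = f"PUT\n{key}\n{expires}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def sign_upload_url(base_url: str, key: str, now: float, ttl: int = 300) -> str:
    """Return a URL scoped to one object key that expires ttl seconds after now."""
    expires = int(now) + ttl
    query = urlencode({"expires": expires, "sig": _signature(key, expires)})
    return f"{base_url}/{key}?{query}"


def verify_upload(key: str, expires: int, sig: str, now: float) -> bool:
    """Storage-side check: the signature matches this key and the URL is still valid."""
    return hmac.compare_digest(_signature(key, expires), sig) and now <= expires
```

Because the signature covers the key, a client cannot reuse one URL to write a different object, and the five-minute `ttl` limits the damage if a URL leaks.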
Retry With Exponential Backoff
Network failures are normal. A player on hotel wifi, a flaky cellular connection, or a transient server error should not cost you a crash dump. Wrap the upload in exponential backoff with jitter: retry after roughly 1s, 2s, 4s, 8s, and 16s, with a 60-second cap on any single delay in case you later raise the retry count. After five failures in one session, give up and leave the dump on disk. The next launch treats it as a fresh attempt.
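A minimal sketch of that retry loop follows. It uses "full jitter" (a random delay between zero and the exponential ceiling), which is one common choice among several. The names `backoff_delay` and `upload_with_retry` are illustrative, and `retry_on` defaults to `OSError` on the assumption that network failures surface that way; pass your HTTP library's exception types if they differ.

```python
import random
import time


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


def upload_with_retry(upload, max_attempts: int = 5,
                      retry_on=(OSError,), sleep=time.sleep) -> bool:
    """Try upload() up to max_attempts times; False means leave the dump on disk."""
    for attempt in range(max_attempts):
        try:
            upload()
            return True
        except retry_on:
            # No point sleeping after the final failure.
            if attempt + 1 < max_attempts:
                sleep(backoff_delay(attempt))
    return False
```

The `sleep` parameter exists so the loop can be tested without waiting; production code would leave it at the default.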
Add a cap on total age. A dump that has been sitting on disk for 30 days is almost certainly stale; the symbols for that build may have been cleaned up, and the player has long since forgotten the crash. Delete dumps older than your symbol retention window.
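The age cap might look like the sketch below. `MAX_AGE_SECONDS` is an assumed constant that you would set to your own symbol retention window, and file modification time stands in for the crash time.

```python
import time
from pathlib import Path
from typing import List, Optional

MAX_AGE_SECONDS = 30 * 24 * 3600  # assumption: matches the symbol retention window


def prune_stale_dumps(dumps: List[Path], now: Optional[float] = None) -> List[Path]:
    """Delete dumps older than MAX_AGE_SECONDS and return the ones that remain."""
    now = time.time() if now is None else now
    fresh = []
    for path in dumps:
        if now - path.stat().st_mtime > MAX_AGE_SECONDS:
            path.unlink(missing_ok=True)  # stale: symbols are probably gone
        else:
            fresh.append(path)
    return fresh
```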
Deduplicate by Fingerprint
If a crash happens in a tight render loop, the game may produce the same dump ten times before the player force-quits. Uploading ten identical 20 MB dumps wastes bandwidth and storage without telling you anything new. Compute a fingerprint from the top five frames of the stack (function names plus offsets, or module names plus offsets when the shipped build carries no symbols), keep a per-fingerprint count in a local file, and refuse to upload more than three dumps with the same fingerprint per 24 hours.
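As a rough sketch of the client-side check, assuming the frames are already available as strings such as `"game.exe+0x1a2b"`: `fingerprint` hashes the top five, and `should_upload` keeps recent upload timestamps per fingerprint in a small JSON file. Both names and the file format are illustrative.

```python
import hashlib
import json
import time
from pathlib import Path
from typing import List, Optional


def fingerprint(frames: List[str], depth: int = 5) -> str:
    """Hash the top `depth` stack frames into a short, stable identifier."""
    return hashlib.sha256("\n".join(frames[:depth]).encode()).hexdigest()[:16]


def should_upload(counter_file: Path, fp: str, now: Optional[float] = None,
                  limit: int = 3, window: float = 24 * 3600) -> bool:
    """Allow at most `limit` uploads per fingerprint inside the rolling window."""
    now = time.time() if now is None else now
    try:
        seen = json.loads(counter_file.read_text())
    except (OSError, ValueError):
        seen = {}  # missing or corrupt counter file: start over
    recent = [t for t in seen.get(fp, []) if now - t < window]
    allowed = len(recent) < limit
    if allowed:
        recent.append(now)
    seen[fp] = recent
    counter_file.write_text(json.dumps(seen))
    return allowed
```

Frames below the fifth do not affect the fingerprint, so the same crash reached through different callers still counts as one.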
Back this up server-side. The backend should also rate-limit by fingerprint per user ID, so a user who reinstalls and loses their local counter cannot flood you with duplicates. Apply the check at the confirm endpoint, and ideally when issuing the signed URL too, so a suppressed dump never reaches storage. Return a 429 and log the suppression rather than processing the dump.
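The server-side limit can be pictured as a sliding window keyed by user and fingerprint. The class below is an in-memory sketch with a hypothetical name; a real backend would presumably keep these counts in shared storage so every server instance sees them, and the limit of three per 24 hours is assumed to mirror the client.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class FingerprintLimiter:
    """In-memory sliding-window limiter keyed by (user_id, fingerprint)."""

    def __init__(self, limit: int = 3, window: float = 24 * 3600):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)

    def check(self, user_id: str, fp: str, now: Optional[float] = None) -> int:
        """Return the HTTP status to send: 200 to accept, 429 to suppress."""
        now = time.time() if now is None else now
        hits = self._hits[(user_id, fp)]
        while hits and now - hits[0] >= self.window:
            hits.popleft()  # forget uploads that fell out of the window
        if len(hits) >= self.limit:
            return 429
        hits.append(now)
        return 200
```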
Respect Player Privacy
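Minidumps contain information players did not explicitly agree to share: user paths embedded in loaded module names, process command lines, potentially injected third-party DLLs (antivirus, overlays, cheats). Before upload, scrub obvious PII. The simplest pass is to mask the user name in anything that looks like a home directory path. Overwrite it with same-length filler rather than a shorter placeholder: a minidump locates its streams by byte offset, so changing the file's length corrupts it. A byte-level pattern like this one only catches ASCII and UTF-8 paths; the minidump format stores module names as UTF-16, which needs a separate pass.
import (
	"bytes"
	"regexp"
)

// Compiled once. \x00 is excluded so a match stops at the end of a C string.
var homeDirRe = regexp.MustCompile(`(?:/home/|/Users/|C:\\Users\\)[^/\\\x00]+`)

// ScrubPaths masks user names with same-length filler so offsets inside the dump stay valid.
func ScrubPaths(dump []byte) []byte {
	return homeDirRe.ReplaceAllFunc(dump, func(m []byte) []byte {
		i := bytes.LastIndexAny(m, `/\`) + 1 // end of the directory prefix
		return append(m[:i:i], bytes.Repeat([]byte("X"), len(m)-i)...)
	})
}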
Show a consent prompt the first time the game launches after a crash. Tell the player what you collect, link to a plain-language privacy page, and let them opt out permanently. Store the choice server-side tied to their account so it follows them across reinstalls. An opt-out should delete any pending dumps on the local disk immediately.
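The consent rules reduce to a small gate in front of the uploader. In this sketch the three-state `Consent` enum and the function name are assumptions; how the stored choice is fetched from the account is left out.

```python
from enum import Enum
from pathlib import Path
from typing import List


class Consent(Enum):
    UNASKED = "unasked"
    GRANTED = "granted"
    DENIED = "denied"


def dumps_to_upload(consent: Consent, pending: List[Path]) -> List[Path]:
    """Gate uploads on consent; a denial deletes pending dumps right away."""
    if consent is Consent.GRANTED:
        return pending
    if consent is Consent.DENIED:
        for path in pending:
            path.unlink(missing_ok=True)
    # UNASKED: keep dumps on disk until the player answers the prompt.
    return []
```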
Monitor the Pipeline
Finally, instrument the uploader. Track upload success rate, time to upload, dedup hit rate, and 24-hour volume. A sudden drop in uploads is either a backend regression or a client-side bug that blocks the helper process. A sudden spike is either a new crash in the field or a dedup failure. Build dashboards that show all four numbers, and alert on changes larger than twenty percent week over week.
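The week-over-week alert is simple arithmetic: relative change is (this week minus last week) divided by last week. A sketch with illustrative metric names and a hypothetical function name:

```python
from typing import Dict


def week_over_week_alerts(last_week: Dict[str, float], this_week: Dict[str, float],
                          threshold: float = 0.20) -> Dict[str, float]:
    """Return each metric whose relative change exceeds the threshold, with that change."""
    alerts = {}
    for name, old in last_week.items():
        new = this_week.get(name)
        if new is None or old == 0:
            continue  # no baseline to compare against
        change = (new - old) / old
        if abs(change) > threshold:
            alerts[name] = change
    return alerts
```

For example, a 24-hour volume that moves from 1000 to 1500 dumps is a change of 0.5, well past the 0.20 threshold, while upload time moving from 4.0 to 4.2 seconds (0.05) stays quiet.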
“We shipped our first crash uploader without dedup. One crash in our shader cache produced 400 dumps from 50 players in the first hour. Our storage bill that day was larger than the previous month. Fingerprint dedup was the next day’s hotfix.”
Related Issues
For a broader look at crash reporting strategy, see automated crash reporting for indie games. For handling crash rate trends over time, read how to track and reduce crash rate over releases.
Instrument your upload pipeline with four metrics today: success rate, time to upload, dedup rate, and 24-hour volume. You cannot fix what you cannot see.