Quick answer: Crashes that only happen in production are caused by real-world conditions you can't fully replicate in testing: device diversity, scale and concurrency, unexpected inputs, real network conditions, and production data.
Some crashes never show up in your testing but plague players in production. Understanding why they happen helps you catch them.
Why Production Is Different
Production means real players in the real world, under conditions your controlled test environment lacks. Crashes that depend on those conditions appear only there:
- Device and hardware diversity: the thousands of device, GPU, OS, and configuration combinations players use, which you can't test exhaustively
- Scale and concurrency: many simultaneous players exposing race conditions and resource issues that don't appear at small scale
- Unexpected player inputs and behavior: players doing things you didn't anticipate or test
- Real network conditions: latency, packet loss, and instability that are absent on your local network
- Production data and state: real saves, accounts, and data in states your test data never reaches
- Real load: server and system load that only occurs with real traffic
The common thread is that production has diversity, scale, and real inputs that your test environment can't replicate, so crashes that need those conditions happen only there.
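To make the production-data case concrete, here is a minimal Python sketch. The save format, the field names, and both loader functions are hypothetical, invented only for illustration. It assumes an older build wrote saves without an "inventory" field, so a loader that passes against freshly generated test data still fails on a real player's save.

```python
import json

# Hypothetical save format: current builds write an "inventory" list.
# Assumption for illustration: saves from an older build lack that field.
FRESH_TEST_SAVE = json.dumps({"version": 2, "inventory": ["sword", "potion"]})
OLD_PLAYER_SAVE = json.dumps({"version": 1})


def load_inventory_size(raw_save: str) -> int:
    """Naive loader: assumes every save has an "inventory" key."""
    save = json.loads(raw_save)
    return len(save["inventory"])  # raises KeyError on the old save


def load_inventory_size_safe(raw_save: str) -> int:
    """Defensive loader: treats a missing inventory as empty."""
    save = json.loads(raw_save)
    return len(save.get("inventory", []))
```

Test data created by the current build never takes the failing path, which is why a crash like this would tend to surface only once real saves are loaded.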
Why You Can't Test Them Away
You fundamentally can't replicate production in testing: you don't have every device, you can't simulate real scale and behavior, and you can't reproduce every real-world condition. So a class of crashes will always appear only in production, no matter how much you test. This is a structural limit, not a testing failure.
Accepting that production crashes are inevitable shifts the strategy from trying to test them away to capturing them. Bugnet captures crashes from real players in production with full context, so the crashes that only happen there surface in a form you can diagnose.
Finding and Fixing Production-Only Crashes
Since they only happen in production, you find them by capturing them there: crash reporting from real players, with the stack trace, device, version, and context that reveal the production conditions behind each crash. You then fix them from that data and the patterns across occurrences, often without reproducing them locally.
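As a rough illustration of what "stack trace, device, version, and context" might look like in a captured report, here is a minimal Python sketch using only the standard library. It is not Bugnet's API. The `APP_VERSION` value, the report field names, and the `build_crash_report` and `run_guarded` helpers are all assumptions made for this example.

```python
import platform
import traceback
from datetime import datetime, timezone

APP_VERSION = "1.4.2"  # hypothetical build version, for illustration only


def build_crash_report(exc: BaseException, context: dict) -> dict:
    """Collect the stack trace plus device, version, and game context."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "exception_type": type(exc).__name__,
        "message": str(exc),
        "stack_trace": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
        "device": {
            "os": platform.system(),
            "os_version": platform.release(),
            "machine": platform.machine(),
        },
        "app_version": APP_VERSION,
        "context": context,  # caller-supplied state, e.g. the current scene
    }


def run_guarded(fn, context: dict):
    """Run fn; on failure return (None, report) instead of raising."""
    try:
        return fn(), None
    except Exception as exc:
        return None, build_crash_report(exc, context)
```

Calling `run_guarded(load_level, {"scene": "level_3"})` with a hypothetical `load_level` function that raises would return a report dictionary ready to send to whatever crash backend you use. A real client would also need to handle native crashes and delivery failures, which this sketch leaves out.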
Bugnet captures production crashes with context and groups them, so you can find and fix crashes you can't reproduce locally. Finding production-only crashes means capturing them in production rather than relying on testing.
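One common way to group crash occurrences is to fingerprint each report by its exception type and innermost stack frames, then look at what the occurrences in a group share. The Python sketch below shows that general idea. It does not describe how Bugnet groups crashes, and the report fields, sample data, and function names are hypothetical.

```python
import hashlib
from collections import defaultdict

# Hypothetical reports; "frames" lists function names, innermost first.
REPORTS = [
    {"exception_type": "KeyError", "frames": ["read_inventory", "load_save"],
     "device": "phone-model-a", "app_version": "1.4.2"},
    {"exception_type": "KeyError", "frames": ["read_inventory", "load_save"],
     "device": "phone-model-a", "app_version": "1.4.1"},
    {"exception_type": "TimeoutError", "frames": ["fetch_profile", "login"],
     "device": "phone-model-b", "app_version": "1.4.2"},
]


def fingerprint(exception_type: str, frames: list) -> str:
    """Hash the exception type and up to three innermost frames."""
    key = exception_type + "|" + "|".join(frames[:3])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]


def group_crashes(reports: list) -> dict:
    """Bucket reports that share a fingerprint."""
    groups = defaultdict(list)
    for report in reports:
        key = fingerprint(report["exception_type"], report["frames"])
        groups[key].append(report)
    return dict(groups)


def common_values(group: list, field: str) -> set:
    """Distinct values of one field across a group's occurrences."""
    return {report[field] for report in group}
```

If every occurrence in a group shares one device but spans several app versions, as the two `KeyError` reports do here, that points toward a device-specific cause rather than a recent regression. This is the kind of pattern you can act on without a local reproduction.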
Production-only crashes come from real-world conditions you can't replicate: device diversity, scale, real inputs, network conditions, and production data. They reflect a structural limit of testing, so capture them from production with context and fix them from the data.