Quick answer: Check the HRESULT returned by Present (and other key calls) for DXGI_ERROR_DEVICE_REMOVED or DXGI_ERROR_DEVICE_RESET. When you see either, tear down and recreate the D3D11 device and every GPU resource.
A DX11 game crashes for some players after a GPU driver update or a TDR (timeout detection & recovery). The device became invalid and the code kept using it.
Detect Device Loss
HRESULT hr = swapChain->Present(1, 0);
if (hr == DXGI_ERROR_DEVICE_REMOVED || hr == DXGI_ERROR_DEVICE_RESET)
{
    HRESULT reason = device->GetDeviceRemovedReason();
    LogDeviceRemoved(reason);
    RecreateDevice();
    return; // skip the rest of this frame
}
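GetDeviceRemovedReason returns an HRESULT that tells you why: DXGI_ERROR_DEVICE_HUNG, DXGI_ERROR_DEVICE_RESET, DXGI_ERROR_DEVICE_REMOVED (for example after a driver upgrade), or DXGI_ERROR_DRIVER_INTERNAL_ERROR. Log it for crash reports.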
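To make the logged reason readable in crash reports, the code can be turned into a symbolic name before it is written out. The following is a minimal sketch of what a helper behind LogDeviceRemoved might do, not the article's actual implementation. The names HResult, RemovedReasonName, and FormatRemovedReason are hypothetical, and the numeric constants are copied by hand so the sketch compiles without the Windows SDK. A real build would include <dxgi.h> and use the SDK macros.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Stand-in for HRESULT so the sketch compiles without the Windows SDK.
using HResult = std::uint32_t;

// Assumption: values copied from the DXGI error codes in the Windows SDK.
// In a real build, include <dxgi.h> and use the SDK macros directly.
constexpr HResult kOk                  = 0x00000000u;
constexpr HResult kInvalidCall         = 0x887A0001u;
constexpr HResult kDeviceRemoved       = 0x887A0005u;
constexpr HResult kDeviceHung          = 0x887A0006u;
constexpr HResult kDeviceReset         = 0x887A0007u;
constexpr HResult kDriverInternalError = 0x887A0020u;

// Maps a removed-reason code to its symbolic name.
const char* RemovedReasonName(HResult reason) {
    switch (reason) {
        case kDeviceHung:          return "DXGI_ERROR_DEVICE_HUNG";
        case kDeviceRemoved:       return "DXGI_ERROR_DEVICE_REMOVED";
        case kDeviceReset:         return "DXGI_ERROR_DEVICE_RESET";
        case kDriverInternalError: return "DXGI_ERROR_DRIVER_INTERNAL_ERROR";
        case kInvalidCall:         return "DXGI_ERROR_INVALID_CALL";
        case kOk:                  return "S_OK";
        default:                   return "UNKNOWN";
    }
}

// Builds the line a hypothetical LogDeviceRemoved would write.
std::string FormatRemovedReason(HResult reason) {
    char buffer[96];
    std::snprintf(buffer, sizeof(buffer), "device removed: %s (0x%08X)",
                  RemovedReasonName(reason), static_cast<unsigned>(reason));
    return buffer;
}
```

Logging both the name and the raw hex value keeps the report useful even when the code is one this table does not recognize.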
Recreate Everything
A removed device invalidates all child objects — buffers, textures, shaders, render targets, the swap chain. RecreateDevice must:
- Release every GPU resource and the device/context/swap chain.
- Recreate the device and swap chain.
- Reload/recreate all resources from CPU-side source data.
This means you must keep (or be able to re-load) the source data for everything on the GPU.
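One way to keep that source data reachable is to register a creation callback for each resource, so that RecreateDevice can replay all of them against the new device. The sketch below shows the pattern only. FakeDevice stands in for ID3D11Device, and ResourceRegistry, Register, and RecreateAll are hypothetical names, not part of any D3D API.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real device (ID3D11Device* in a D3D11 renderer).
struct FakeDevice {
    int generation = 0;  // bumped each time the device is recreated
};

// Remembers how to rebuild every GPU resource from CPU-side data.
class ResourceRegistry {
public:
    // Callback that (re)creates one resource on the given device.
    // Returns false if creation failed.
    using CreateFn = std::function<bool(FakeDevice&)>;

    void Register(std::string name, CreateFn create) {
        entries_.push_back(Entry{std::move(name), std::move(create)});
    }

    // Re-runs every callback, in registration order, against a new device.
    // Returns the names of the resources that failed, for logging.
    std::vector<std::string> RecreateAll(FakeDevice& device) const {
        std::vector<std::string> failed;
        for (const Entry& entry : entries_) {
            if (!entry.create(device)) {
                failed.push_back(entry.name);
            }
        }
        return failed;
    }

private:
    struct Entry {
        std::string name;
        CreateFn create;
    };
    std::vector<Entry> entries_;
};
```

Each callback would capture (or know how to reload) its own CPU-side data, such as a vertex array or a texture file path, so nothing depends on reading back from the lost device.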
TDR Is the Common Trigger
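A GPU command that runs longer than the TDR timeout (2 seconds by default) triggers Windows TDR, which resets the GPU and can remove the device for apps using it, not only the one that caused the hang. You can avoid hanging the GPU with your own work, but you can't prevent a TDR caused by the driver or another app. You can only recover gracefully from it.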
Verifying
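Simulate device loss in dev (for example, running dxcap -forcetdr from a Visual Studio Developer Command Prompt forces a TDR while the game is running). The game should recreate the device and resume rendering instead of crashing, and crash telemetry should show the removed reason logged rather than a hard crash.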
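The recovery branch can also be exercised without a GPU by injecting the present call, so an automated test can return a device-removed code on a chosen frame. This is a sketch under that assumption. FrameLoop, PresentResult, and the kPresent constants are hypothetical names, and the numeric values are hand-copied from the DXGI error codes rather than taken from the SDK headers.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Stand-in for HRESULT so the sketch compiles without the Windows SDK.
using PresentResult = std::uint32_t;
constexpr PresentResult kPresentOk            = 0x00000000u;
constexpr PresentResult kPresentDeviceRemoved = 0x887A0005u;  // DXGI_ERROR_DEVICE_REMOVED
constexpr PresentResult kPresentDeviceReset   = 0x887A0007u;  // DXGI_ERROR_DEVICE_RESET

// Frame loop with injectable present/recreate steps. In the real renderer
// these would wrap swapChain->Present(1, 0) and RecreateDevice().
class FrameLoop {
public:
    FrameLoop(std::function<PresentResult()> present,
              std::function<void()> recreateDevice)
        : present_(std::move(present)),
          recreateDevice_(std::move(recreateDevice)) {}

    // Runs one frame. Returns false if the frame was skipped for recovery.
    bool RunFrame() {
        const PresentResult hr = present_();
        if (hr == kPresentDeviceRemoved || hr == kPresentDeviceReset) {
            recreateDevice_();
            ++recoveries_;
            return false;  // skip the rest of this frame
        }
        ++framesPresented_;
        return true;
    }

    int framesPresented() const { return framesPresented_; }
    int recoveries() const { return recoveries_; }

private:
    std::function<PresentResult()> present_;
    std::function<void()> recreateDevice_;
    int framesPresented_ = 0;
    int recoveries_ = 0;
};
```

A test can pass a present callback that fails on a chosen frame, then check that exactly one recovery ran and that later frames were presented normally. This only covers the control flow. A forced TDR on real hardware is still needed to verify that every resource is actually recreated.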
“Device removed invalidates everything. Detect on Present, tear down, recreate device and all resources.”
Engines (Unreal, Unity) handle this internally — this is a hand-rolled-renderer concern. If you wrote the D3D layer, you own the recovery.