What is a smoke test for a game build?

A smoke test is a minimal automated check that verifies a game build launches, loads key scenes, accepts basic input, and runs without crashing for a set duration. It catches catastrophic build failures before the build reaches QA or players.

How do you run a game in batch mode for CI testing?

Most engines support a headless or batch mode flag that launches the game without rendering to a window. In Unity, use -batchmode -nographics. In Godot, use --headless. This allows the game to run on CI servers without a GPU or display.

What exit codes should game smoke tests return?

Return exit code 0 for success and a non-zero code for failure. Use distinct codes for different failure types: 1 for crash, 2 for scene load failure, 3 for timeout. CI systems use exit codes to determine whether a build step passed or failed.

How to Build Automated Smoke Tests for Game Builds

Quick answer: Build four smoke tests that run on every CI build: a launch test that verifies the game starts without crashing, a scene load test that opens each critical scene and checks for errors, a basic input test that simulates player actions, and a no-crash duration test that lets the game run for 60 seconds. Use batch mode to run headless on CI servers and return distinct exit codes for each failure type.

Most game developers have, at some point, shipped a build that does not launch. A missing asset reference, a broken script import, or a misconfigured scene tree can produce a build that compiles successfully but crashes the moment a player tries to run it. Manual QA catches these problems, but only after someone downloads the build, runs it, and reports back, a cycle that can take hours or days. Automated smoke tests catch them in minutes, on every build, before anyone touches the binary. They are not a replacement for playtesting. They are the safety net that ensures every build that reaches playtesters is at least capable of running.

The Launch Test

The most basic smoke test verifies that the game process starts and reaches a known state without crashing. This sounds trivial, but launch failures are a common class of build problem. A renamed scene, a deleted autoload script, or a shader compilation error on a specific GPU can all prevent the game from starting.

The launch test starts the game process with a command-line argument that tells it to run in smoke test mode. The game initializes normally, reaches the main menu or a designated test scene, writes a success marker, and exits cleanly. If the process crashes before writing the marker, the test fails.

# Smoke test entry point - runs when the --smoke-test flag is passed.
# Helpers such as startup_time_ms() and write_results() are defined elsewhere.
func run_smoke_tests():
    var results = []

    # Test 1: Launch test - did we get here without crashing?
    results.append({
        "test": "launch",
        "passed": true,
        "time_ms": startup_time_ms()
    })

    # Test 2: Scene load tests (awaited so each scene can finish initializing)
    var scene_results = await run_scene_load_tests()
    results.append_array(scene_results)

    # Test 3: Duration test - run for 60 seconds without blocking the game loop
    var duration_result = await run_duration_test(60)
    results.append(duration_result)

    # Write results and exit with an appropriate code
    write_results("smoke_test_results.json", results)
    var failed = results.filter(func(r): return not r.passed)
    # Assumes this script is a Node in the scene tree, e.g. an autoload
    get_tree().quit(0 if failed.is_empty() else 1)

Keep the launch test fast. It should complete in under 30 seconds. If your game takes longer than that to reach the main menu, consider creating a minimal test scene that loads only the core systems. The point is to verify that the engine initializes, scripts load, and the game loop starts — not to test gameplay.
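On the CI side, the launch check described above can be wrapped in a few lines of shell. This is a minimal sketch rather than a definitive harness: the function name `run_launch_test`, the marker file path, and the time budget are assumptions made for illustration, and it relies on the `timeout` command from GNU coreutils.

```shell
#!/bin/sh
# Hypothetical CI-side launch check. The marker file name and the timeout are
# assumptions; adapt them to however your build reports success.

# Usage: run_launch_test <marker_file> <timeout_secs> <command> [args...]
run_launch_test() {
    marker_file=$1
    timeout_secs=$2
    shift 2

    # Remove any stale marker so an earlier run cannot mask a failure.
    rm -f "$marker_file"

    # GNU timeout exits with 124 when it had to kill the command.
    status=0
    timeout "$timeout_secs" "$@" || status=$?

    if [ "$status" -eq 124 ]; then
        echo "launch test: no exit within ${timeout_secs}s" >&2
        return 1
    fi
    if [ "$status" -ne 0 ]; then
        echo "launch test: process exited with code $status" >&2
        return 1
    fi
    if [ ! -f "$marker_file" ]; then
        echo "launch test: clean exit but no success marker" >&2
        return 1
    fi
    echo "launch test: passed"
}
```

A launch-only run might look like `run_launch_test launch_ok.marker 30 ./game.x86_64 --headless --smoke-test`, assuming the build's smoke test mode writes `launch_ok.marker` and exits once the launch check succeeds.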

Scene Load Tests

After the launch test passes, iterate through every critical scene in your game and verify that each one loads without errors. Critical scenes include the main menu, all gameplay levels, the settings screen, the inventory or shop, and any scene that is referenced from another scene. A missing texture, a broken node path, or a script error that only triggers during a specific scene’s _ready() function will be caught here.

The test loads each scene, waits for it to finish initializing, checks the error log for any errors or warnings that occurred during loading, and records the result. If a scene fails to load or produces errors, log the scene name and the specific error so the developer knows exactly what to fix.

Maintain a list of scenes to test rather than discovering them at runtime. Automatic scene discovery sounds convenient, but it can include test scenes, work-in-progress scenes, and editor-only scenes that are not meant to be loaded in a build. A curated list ensures you test what players will actually encounter and avoids false positives from development artifacts.

Scene load tests also catch dependency problems that the compiler misses. A scene might reference a resource file that exists in the editor but was excluded from the export configuration. The build compiles without error, but the scene crashes at runtime when it tries to load the missing resource. Running scene load tests on the exported build — not in the editor — is essential for catching these issues.
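The curated list and the per-scene error reporting described above can also be checked from the CI side. The sketch below assumes a convention that is not part of any engine: the list lives in a plain text file with one scene path per line, and the game logs either `SCENE_OK <path>` or `SCENE_FAIL <path>: <error>` for each scene it tries to load. The function name and both log prefixes are hypothetical.

```shell
#!/bin/sh
# Hypothetical check of scene load results against a curated scene list.
# Assumed log convention: "SCENE_OK <path>" or "SCENE_FAIL <path>: <error>".

# Usage: check_scene_results <scene_list_file> <log_file>
check_scene_results() {
    scene_list=$1
    log_file=$2
    failures=0

    while IFS= read -r scene || [ -n "$scene" ]; do
        # Skip blank lines and comment lines in the curated list.
        case $scene in
            ''|'#'*) continue ;;
        esac
        if grep -Fqx "SCENE_OK $scene" "$log_file"; then
            continue
        fi
        # Report the game's own error line if it logged one for this scene.
        reason=$(grep -F "SCENE_FAIL $scene:" "$log_file" | head -n 1)
        reason=${reason#SCENE_FAIL }
        echo "scene load failed: ${reason:-$scene (no result logged)}" >&2
        failures=$((failures + 1))
    done < "$scene_list"

    # Succeed only if every listed scene reported SCENE_OK.
    [ "$failures" -eq 0 ]
}
```

A scene that is listed but never appears in the log counts as a failure, so a crash partway through the list is reported instead of passing silently.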

The No-Crash Duration Test

Some bugs only manifest after the game has been running for a few seconds. Memory errors that surface only on a later garbage collection cycle, timers that fire after initialization completes, and deferred loading systems that fail asynchronously will all pass a launch test but crash shortly after. The duration test catches these by letting the game run in a gameplay scene for 60 seconds.

Load a representative gameplay scene, enable basic simulated input — a character moving in a random pattern, a camera rotating, menus opening and closing — and let the game run. If the process is still alive and responsive after 60 seconds, the test passes. If the process crashes, hangs, or stops producing frames, it fails.

Sixty seconds is long enough to catch most deferred initialization bugs without making the CI pipeline unacceptably slow. If your game has loading sequences that take longer than 60 seconds, extend the duration accordingly. Some teams run a short duration test (60 seconds) on every commit and a longer test (10 minutes) on nightly builds.
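If the build cannot time itself, the liveness part of the duration test can be approximated from outside the process. The following is a sketch under assumptions: `run_duration_test` is a hypothetical helper, and it only detects a process that exits or crashes. A hang or a stalled frame loop still needs a heartbeat from inside the game.

```shell
#!/bin/sh
# Hypothetical external duration check: start the build, poll once per second,
# and fail if the process disappears before the time budget is used up.

# Usage: run_duration_test <seconds> <command> [args...]
run_duration_test() {
    duration_secs=$1
    shift

    "$@" &
    game_pid=$!

    elapsed=0
    while [ "$elapsed" -lt "$duration_secs" ]; do
        sleep 1
        elapsed=$((elapsed + 1))
        # kill -0 sends no signal; it only reports whether the PID still exists.
        if ! kill -0 "$game_pid" 2>/dev/null; then
            status=0
            wait "$game_pid" || status=$?
            echo "duration test: process ended after ~${elapsed}s (exit code $status)" >&2
            return 1
        fi
    done

    # Still alive after the full duration: stop the build and report success.
    kill "$game_pid" 2>/dev/null || true
    wait "$game_pid" 2>/dev/null || true
    echo "duration test: survived ${duration_secs}s"
}
```

For example, `run_duration_test 60 ./game.x86_64 --headless` would treat any exit within the first 60 seconds as a failure, including a clean one, so this variant suits builds that keep running until they are stopped.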

Monitor memory usage during the duration test. If memory grows steadily over the 60 seconds, you likely have a leak that will crash the game during longer play sessions. Log memory usage at the start and end of the test and fail if the delta exceeds a threshold. This turns the smoke test into a basic memory regression detector.
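The start-versus-end comparison can be sketched as below. This is one possible approach, not the only one: it is Linux-only because it reads the `VmRSS` field from `/proc`, and the function names and the threshold are assumptions to tune per project.

```shell
#!/bin/sh
# Hypothetical memory regression check. Linux-only: it reads resident set size
# from /proc, and the threshold is an assumption to tune per project.

# Print the resident set size of a process in kB (the VmRSS field).
read_rss_kb() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# Usage: check_memory_growth <start_kb> <end_kb> <max_delta_kb>
check_memory_growth() {
    start_kb=$1
    end_kb=$2
    max_delta_kb=$3
    delta_kb=$((end_kb - start_kb))

    if [ "$delta_kb" -gt "$max_delta_kb" ]; then
        echo "memory check: grew ${delta_kb} kB (limit ${max_delta_kb} kB)" >&2
        return 1
    fi
    echo "memory check: grew ${delta_kb} kB, within limit"
}
```

In practice you would call `read_rss_kb "$game_pid"` shortly after the gameplay scene loads and again just before stopping the build, then pass both samples to `check_memory_growth`. Taking the first sample after loading finishes keeps normal asset loading from being counted as growth.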

Batch Mode and Headless Execution

CI servers typically do not have GPUs or displays. Running game smoke tests on CI requires batch or headless mode, where the engine runs without creating a window or rendering to screen. Most game engines support this natively.

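# Unity player in batch mode (-executeMethod is an Editor-only argument, so the
# build reads a custom flag such as --smoke-test from its command line instead)
./GameBuild -batchmode -nographics -logFile smoke.log --smoke-test

# Godot headless mode
./game.x86_64 --headless --smoke-test

# Unreal Engine
./GameBuild.exe -nullrhi -nosplash -ExecCmds="automation RunTests SmokeTests; quit"

# CI pipeline exit code check: $? holds the status of the most recent command,
# so run this check directly after the smoke test command for your engine
if [ $? -ne 0 ]; then
    echo "Smoke tests FAILED - blocking deployment"
    exit 1
fi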

Headless mode disables rendering but keeps the game logic running. This means your smoke tests can verify script execution, scene loading, system initialization, and game state management — everything except visual output. For rendering-specific tests, you need a CI runner with a GPU or a virtual framebuffer, which adds complexity. Start with headless smoke tests and add GPU-backed tests later if visual regressions become a significant problem.

Use distinct exit codes for different failure types. Exit code 0 means all tests passed. Exit code 1 means a crash occurred. Exit code 2 means a scene failed to load. Exit code 3 means the duration test timed out. Distinct codes let your CI pipeline report the specific failure type in notifications, saving developers the step of reading the full log to figure out what went wrong.
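A CI script can translate these codes into a readable label for notifications. The sketch below uses the code assignments from the paragraph above; the function name is hypothetical. It also handles one case the list does not cover: when a process is killed by a signal (a segmentation fault, for example), the shell reports 128 plus the signal number, not a code the game chose.

```shell
#!/bin/sh
# Hypothetical mapping from smoke test exit codes to human-readable labels.

# Usage: describe_smoke_exit_code <exit_code>
describe_smoke_exit_code() {
    code=$1
    case $code in
        0) echo "all smoke tests passed" ;;
        1) echo "crash" ;;
        2) echo "scene load failure" ;;
        3) echo "duration test timeout" ;;
        *)
            # Shells report death by signal N as exit status 128 + N.
            if [ "$code" -gt 128 ]; then
                echo "killed by signal $((code - 128))"
            else
                echo "unknown failure (exit code $code)"
            fi
            ;;
    esac
}
```

Called as `describe_smoke_exit_code "$?"` right after the smoke test command, this gives the pipeline a short failure label to include in its report.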

CI Pipeline Integration

Place the smoke test step immediately after the build step in your CI pipeline. Build the game, then run the smoke tests on the freshly built binary. If the smoke tests fail, block all downstream steps: do not upload the build to Steam, do not notify QA, do not deploy to playtest servers. A build that cannot survive 60 seconds of automated testing should never reach a human tester.
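Most CI systems stop a job at the first failing step on their own, but when the pipeline is a single script the same gating can be made explicit. This is a rough sketch with a hypothetical helper, `run_gated_steps`, which takes the names of shell functions or commands and runs them in order.

```shell
#!/bin/sh
# Hypothetical pipeline gate: run named steps in order, stop at first failure.
# Each step is a shell function or command name supplied by the caller.

# Usage: run_gated_steps <step> [step...]
run_gated_steps() {
    for step in "$@"; do
        echo "== running: $step"
        if ! "$step"; then
            echo "step '$step' failed; skipping all later steps" >&2
            return 1
        fi
    done
    echo "all steps passed"
}
```

With step functions of your own, for instance `run_gated_steps build_game run_smoke_tests upload_build notify_qa` (all four names are placeholders), a smoke test failure means the upload and notification steps never run.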

Store smoke test results as CI artifacts. Save the results JSON file, the game log, and any crash dumps produced during the test. When a smoke test fails, the developer should be able to download these artifacts and reproduce the problem locally without running the full CI pipeline again.
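Gathering those files into one directory makes them easy to hand to whatever artifact upload mechanism your CI system provides. A minimal sketch, assuming a hypothetical `collect_artifacts` helper and file names taken from the examples in this article:

```shell
#!/bin/sh
# Hypothetical artifact collection: copy whichever result files exist into one
# directory that the CI system is configured to upload.

# Usage: collect_artifacts <dest_dir> <path> [path...]
collect_artifacts() {
    dest_dir=$1
    shift
    mkdir -p "$dest_dir"

    for path in "$@"; do
        if [ -e "$path" ]; then
            cp -R "$path" "$dest_dir/"
        else
            # A passing run usually has no crash dump, so a missing file is
            # reported but does not fail the collection step.
            echo "artifact not found, skipping: $path" >&2
        fi
    done
}
```

For example, `collect_artifacts artifacts smoke_test_results.json smoke.log crash_dumps` copies the results file and log and notes when the (assumed) `crash_dumps` directory is absent. Run it whether or not the smoke tests passed, since the artifacts matter most on failure.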

Configure notifications so the team knows immediately when a smoke test fails. A Slack or Discord message with the commit hash, the failing test name, and a link to the CI artifacts gets the right person investigating within minutes. In Bugnet, you can configure webhooks to automatically create a bug report when a CI smoke test fails, pre-populated with the build version, failure type, and log output.
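As an illustration of the chat notification, the sketch below builds the JSON body that a Slack incoming webhook accepts (`{"text": "..."}`) and posts it with `curl`. The function names and the message wording are assumptions. A Discord webhook expects the field `content` instead of `text`.

```shell
#!/bin/sh
# Hypothetical failure notification for a Slack incoming webhook, which
# accepts a JSON body of the form {"text": "..."}.

# Escape backslashes and double quotes so a value can sit inside a JSON string.
# Assumes single-line input without control characters.
json_escape() {
    printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Usage: build_failure_payload <commit_hash> <test_name> <artifacts_url>
build_failure_payload() {
    message="Smoke test failed: $2 (commit $1). Artifacts: $3"
    printf '{"text":"%s"}\n' "$(json_escape "$message")"
}

# Usage: send_failure_notification <webhook_url> <commit> <test> <artifacts_url>
send_failure_notification() {
    webhook_url=$1
    shift
    curl -fsS -X POST -H 'Content-Type: application/json' \
        -d "$(build_failure_payload "$@")" "$webhook_url"
}
```

Keep the webhook URL in a CI secret, not in the repository, and pass it in at run time.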

“Our smoke tests run in under two minutes per build. In the first month, they caught seven builds that would not launch, three scenes with missing references, and one memory leak that would have crashed the game after ten minutes of play. None of those issues reached a player.”

Related Resources

For identifying unreliable tests in your pipeline, see how to identify flaky tests in your game CI pipeline. To learn about tracking crash rates across versions, read how to track player session crashes across versions. For a broader QA strategy, check out automated QA testing for indie game studios.

Add a launch test to your CI pipeline today. It takes thirty minutes to set up and catches the most embarrassing class of bugs: the build that does not start.