Quick answer: Automated playtesting uses scripts, bots, or replay systems to run through your game without a human player. Indie teams need it because manual QA does not scale with small teams.

Setting up automated playtesting is a common stumbling block for indie developers. Manual playtesting is essential for game feel, but it cannot catch every regression in every build. When a Tuesday commit breaks level three’s boss door and nobody notices until Friday’s playtest session, you have lost three days. Automated playtesting catches those regressions within minutes. This guide covers how to build input recording and replay systems, write automated smoke tests, set up performance benchmarks, and integrate everything into a CI pipeline — with concrete examples for both Unity and Godot.

Why Automated Playtesting Matters for Small Teams

Large studios have dedicated QA departments that run through test plans every day. Indie teams do not. You might have two developers who playtest their own work, or a handful of Discord volunteers who test when they feel like it. That is not enough coverage to catch regressions consistently.

Automated playtesting fills the gap by running deterministic checks on every build. It does not replace human testers — machines cannot judge whether a jump feels satisfying or a puzzle is too hard. But machines are excellent at verifying that the jump still works, the puzzle is still solvable, and the game does not crash when loading level four. By automating the mechanical checks, you free your human testers to focus on the subjective work that actually requires a human.

The investment is modest. A basic smoke test suite takes a day to set up. A replay system takes two or three days. The return is measured in bugs caught before they reach players, and in the confidence to ship builds quickly.

Recording Gameplay Sessions

The foundation of automated playtesting is the ability to record a gameplay session and replay it later. The simplest approach is input recording: capture every player input along with the frame or tick number when it occurred, then feed those inputs back during replay.

In Unity, you can poll input each physics tick to record actions (this example uses the legacy Input Manager; the same pattern works with the newer Input System package):

using System.Collections.Generic;
using System.IO;
using UnityEngine;

public class InputRecorder : MonoBehaviour
{
    private List<InputFrame> _frames = new List<InputFrame>();
    private int _frameCount = 0;
    private bool _recording = false;

    [System.Serializable]
    public struct InputFrame
    {
        public int frame;
        public float moveX;
        public float moveY;
        public bool jump;
        public bool attack;
    }

    public void StartRecording()
    {
        _frames.Clear();
        _frameCount = 0;
        _recording = true;
    }

    void FixedUpdate()
    {
        if (!_recording) return;

        _frames.Add(new InputFrame
        {
            frame = _frameCount,
            moveX = Input.GetAxisRaw("Horizontal"),
            moveY = Input.GetAxisRaw("Vertical"),
            jump = Input.GetButton("Jump"),
            attack = Input.GetButton("Fire1")
        });
        _frameCount++;
    }

    public void SaveRecording(string path)
    {
        _recording = false;
        var json = JsonUtility.ToJson(new RecordingData { frames = _frames });
        File.WriteAllText(path, json);
    }

    // Wrapper type so JsonUtility can serialize the frame list as a single object
    [System.Serializable]
    private class RecordingData
    {
        public List<InputFrame> frames;
    }
}

In Godot 4, the same concept uses the built-in Input singleton:

class_name InputRecorder
extends Node

var _frames: Array[Dictionary] = []
var _tick: int = 0
var _recording: bool = false

func start_recording() -> void:
    _frames.clear()
    _tick = 0
    _recording = true

func _physics_process(_delta: float) -> void:
    if not _recording:
        return

    _frames.append({
        "tick": _tick,
        "move_x": Input.get_axis("move_left", "move_right"),
        "move_y": Input.get_axis("move_up", "move_down"),
        "jump": Input.is_action_pressed("jump"),
        "attack": Input.is_action_pressed("attack")
    })
    _tick += 1

func save_recording(path: String) -> void:
    _recording = false
    var file := FileAccess.open(path, FileAccess.WRITE)
    if file == null:
        push_error("Could not open " + path + " for writing")
        return
    file.store_string(JSON.stringify(_frames))
    file.close()

Record inputs in FixedUpdate (Unity) or _physics_process (Godot) rather than in the regular update loop. Physics ticks are deterministic at a fixed rate, which makes replays reproducible. If you record in the variable-rate render loop, frame rate differences between recording and playback will cause the replay to drift.
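The drift problem is easy to demonstrate with a toy integration, stripped of any engine: the same per-frame input applied over the same number of frames lands in a different place when frame durations vary. This Python sketch is purely illustrative — the speeds and frame rates are made up.

```python
def simulate(frame_dts, speed=5.0):
    """Integrate a constant-velocity move over a list of frame durations."""
    pos = 0.0
    for dt in frame_dts:
        pos += speed * dt  # per-frame input: "move right"
    return pos

# 60 fixed physics ticks at exactly 1/60 s each
fixed = simulate([1 / 60] * 60)

# 60 render frames whose durations fluctuate (55-70 fps)
variable = simulate([1 / 55] * 30 + [1 / 70] * 30)

print(round(fixed, 4), round(variable, 4))  # same frame count, different positions
```

Replaying a fixed-tick recording against a fixed-tick simulation avoids this class of divergence entirely.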

Building a Replay System

The replay system reads a recorded input log and feeds each frame’s inputs back into the game. During replay, you override the normal input system so the game reads from the log instead of from the keyboard or controller.

The key requirement is that your game logic reads input through an abstraction layer rather than directly from the hardware. Instead of calling Input.GetButton("Jump") throughout your codebase, call a wrapper method that returns either live input or recorded input depending on the current mode. This is a small refactor with a large payoff: it enables not just automated testing but also demo recording, tutorial ghosts, and spectator replays.
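The shape of that abstraction is easier to see outside any engine. Here is a minimal, engine-agnostic Python sketch of the pattern — all class and method names are illustrative, not from Unity or Godot:

```python
class LiveInput:
    """Reads from hardware; stubbed out here for illustration."""
    def get_button(self, name):
        return False  # a real implementation would poll the device


class ReplayInput:
    """Reads from a recorded list of per-frame input dicts."""
    def __init__(self, frames):
        self._frames = frames
        self._index = 0

    def get_button(self, name):
        if self._index >= len(self._frames):
            return False  # past the end of the recording
        return self._frames[self._index].get(name, False)

    def advance(self):
        self._index += 1


class InputRouter:
    """Game code calls this instead of the engine's input API,
    so the source can be swapped without touching gameplay code."""
    def __init__(self, source):
        self.source = source

    def get_button(self, name):
        return self.source.get_button(name)


frames = [{"jump": False}, {"jump": True}, {"jump": False}]
replay = ReplayInput(frames)
router = InputRouter(replay)

pressed = []
for _ in frames:
    pressed.append(router.get_button("jump"))
    replay.advance()

print(pressed)  # [False, True, False]
```

Swapping `ReplayInput` for `LiveInput` at startup is the entire mode switch; gameplay code never knows the difference.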

During replay, compare the resulting game state against expected checkpoints. For example, after replaying 500 frames of a level-one speedrun, verify that the player’s position is within a tolerance of the expected position and that the level completion flag is set. If the replay diverges from expectations, something in the game logic has changed — and you have found a regression.
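The checkpoint comparison itself is simple arithmetic. A Python sketch of the idea — the positions, frame number, and tolerance below are illustrative:

```python
import math

def check_checkpoint(actual_pos, expected_pos, tolerance=0.5):
    """True if the replayed 2D position is within tolerance (Euclidean
    distance) of the recorded expectation."""
    dx = actual_pos[0] - expected_pos[0]
    dy = actual_pos[1] - expected_pos[1]
    return math.hypot(dx, dy) <= tolerance

# Expected state after 500 replayed frames (values illustrative)
expected = {"frame": 500, "pos": (42.0, 3.5), "level_complete": True}
actual = {"frame": 500, "pos": (42.1, 3.4), "level_complete": True}

ok = (check_checkpoint(actual["pos"], expected["pos"])
      and actual["level_complete"] == expected["level_complete"])
print("PASS" if ok else "FAIL: replay diverged")
```

A small tolerance absorbs harmless floating-point noise while still catching real logic changes that move the player somewhere else.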

Automated Smoke Tests

Smoke tests answer one question: does the game boot and not crash? They are the lowest-effort, highest-value automated tests you can write. A smoke test loads each scene in your game, waits a few seconds, checks for error logs, and moves on.

In Unity, you can write smoke tests using the Unity Test Framework:

using System.Collections;
using NUnit.Framework;
using UnityEngine;
using UnityEngine.TestTools;
using UnityEngine.SceneManagement;

public class SmokeTests
{
    // Scenes must be listed in Build Settings, or LoadScene cannot find them
    private static readonly string[] Scenes = new[]
    {
        "MainMenu", "Level1", "Level2", "Level3", "BossArena"
    };

    [UnityTest]
    public IEnumerator AllScenesLoadWithoutErrors(
        [ValueSource(nameof(Scenes))] string sceneName)
    {
        SceneManager.LoadScene(sceneName);
        yield return null; // wait one frame for load

        // Verify the scene actually loaded
        Assert.AreEqual(sceneName,
            SceneManager.GetActiveScene().name);

        // Let it run for 120 frames (~2 seconds at 60fps)
        for (int i = 0; i < 120; i++)
            yield return null;

        // If we got here without exceptions, the scene is stable
        Assert.Pass();
    }
}

In Godot, you can use GDScript with the GdUnit4 framework, or write a standalone SceneTree script that iterates through scenes and runs via the --script flag:

extends SceneTree

var _scenes: Array[String] = [
    "res://scenes/main_menu.tscn",
    "res://scenes/level_1.tscn",
    "res://scenes/level_2.tscn",
    "res://scenes/level_3.tscn",
    "res://scenes/boss_arena.tscn"
]

func _init() -> void:
    var failed: Array[String] = []
    for scene_path in _scenes:
        var scene := load(scene_path) as PackedScene
        if scene == null:
            failed.append(scene_path + ": failed to load")
            continue
        var instance := scene.instantiate()
        root.add_child(instance)
        # Let it run for 120 rendered frames
        for i in 120:
            await process_frame
        instance.queue_free()
        print("PASS: " + scene_path)

    if failed.size() > 0:
        for msg in failed:
            printerr("FAIL: " + msg)
        quit(1)
    else:
        quit(0)

Run smoke tests after every build. They take seconds, and they catch the most common regression: a scene that no longer loads because someone moved an asset, deleted a node, or broke a script reference.

Performance Benchmarking

Automated performance benchmarks detect frame rate regressions before players do. The approach is straightforward: replay a recorded session (or run a scripted bot through a known heavy area) and measure frame times throughout. Store the results and compare them against the previous build.

Measure Time.unscaledDeltaTime (Unity) or Performance.get_monitor(Performance.TIME_PROCESS) (Godot) every frame during the benchmark. Compute the average, 95th percentile, and worst-case frame times. If any metric exceeds a threshold — say, the 95th percentile frame time rises above 20ms when it was previously 14ms — flag the build as a performance regression.
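The statistics and the threshold check are a few lines in any language. Here is a Python sketch — the sample data and thresholds are illustrative, and the 95th percentile uses a simple nearest-rank approximation:

```python
def frame_stats(frame_times_ms):
    """Compute average, approximate 95th percentile, and worst-case
    frame time from a list of per-frame samples in milliseconds."""
    ordered = sorted(frame_times_ms)
    p95_index = min(len(ordered) - 1, int(len(ordered) * 0.95))
    return {
        "avg": sum(ordered) / len(ordered),
        "p95": ordered[p95_index],
        "worst": ordered[-1],
    }

def regressed(stats, thresholds):
    """Return the names of metrics that exceed their thresholds."""
    return [k for k, limit in thresholds.items() if stats[k] > limit]

# Illustrative benchmark run: mostly 14 ms frames with a few 22 ms spikes
samples = [14.0] * 95 + [22.0] * 5
stats = frame_stats(samples)
failures = regressed(stats, {"avg": 16.0, "p95": 20.0, "worst": 33.0})
print(failures)  # ['p95']
```

Here the average looks healthy, but the 95th percentile exposes the spikes — exactly the kind of stutter players notice before a falling average does.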

Store benchmark results in a simple JSON or CSV file that you commit to your repository or upload as a CI artifact. Over time, this gives you a performance history that reveals trends. A slow creep from 12ms to 16ms average frame time over ten builds is invisible in any single comparison but obvious in a graph.
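A minimal JSON history plus a creep detector might look like this Python sketch — the file path, window size, and 2 ms threshold are assumptions, not recommendations:

```python
import json
import os

def append_result(path, build_id, avg_ms):
    """Append one build's average frame time to a JSON history file."""
    history = []
    if os.path.exists(path):
        with open(path) as f:
            history = json.load(f)
    history.append({"build": build_id, "avg_ms": avg_ms})
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(path, "w") as f:
        json.dump(history, f, indent=2)
    return history

def creeping(history, window=10, max_rise_ms=2.0):
    """Flag slow regressions: compare the newest average against the
    oldest one inside the window."""
    recent = [h["avg_ms"] for h in history[-window:]]
    return len(recent) >= 2 and recent[-1] - recent[0] > max_rise_ms

# A gradual 0.5 ms rise per build: invisible build-to-build, obvious over ten
rising = [{"build": i, "avg_ms": 12.0 + 0.5 * i} for i in range(10)]
print(creeping(rising))  # True
```

Each single comparison in that series passes a 2 ms gate, but the windowed check catches the cumulative 4.5 ms drift.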

CI Integration

The final step is running your tests automatically on every push. Both Unity and Godot support headless execution, which means they can run in CI environments without a GPU (though GPU-dependent tests will need a runner with graphics support).

For Unity with GitHub Actions, use GameCI’s action to run EditMode and PlayMode tests:

# .github/workflows/test.yml
name: Game Tests
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: game-ci/unity-test-runner@v4
        with:
          projectPath: .
          testMode: PlayMode
          unityVersion: auto
        env:
          UNITY_LICENSE: ${{ secrets.UNITY_LICENSE }}

For Godot, run the engine in headless mode with your test script:

# .github/workflows/test.yml
name: Game Tests
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Godot
        run: |
          wget -q https://github.com/godotengine/godot/releases/download/4.4-stable/Godot_v4.4-stable_linux.x86_64.zip
          unzip Godot_v4.4-stable_linux.x86_64.zip
      - name: Import assets
        run: |
          # Godot 4 needs an import pass before headless scene loads succeed
          ./Godot_v4.4-stable_linux.x86_64 --headless --import
      - name: Run smoke tests
        run: |
          ./Godot_v4.4-stable_linux.x86_64 --headless --script tests/smoke_test.gd

Set your CI to block merges when tests fail. This is the enforcement mechanism that makes automated testing actually work. Without it, failing tests become background noise that everyone ignores. With it, regressions get fixed before they reach the main branch.

Practical Tips for Indie Teams

Start small. You do not need a comprehensive test suite on day one. Begin with smoke tests that load every scene — this alone catches a surprising number of regressions. Add replay-based tests for your most fragile systems (boss fights, cutscenes, level transitions). Add performance benchmarks when frame rate becomes a concern.

Keep your test recordings short. A 30-second replay that covers the critical path through a level is more valuable than a 10-minute recording that wanders around. Short replays run faster in CI and are easier to update when the level changes.

Version your recordings alongside your code. When a level changes enough that old replays break, record a new one. This is maintenance overhead, but it is far less than the overhead of manual regression testing. Treat broken replay tests the same way you treat broken unit tests: fix them before merging.

Related Issues

For more on catching bugs early, see our guide on building a QA checklist for game release. If you are tracking performance regressions specifically, our article on tracking game performance issues across devices covers device-specific benchmarking. For setting up automated crash detection alongside playtesting, check how to set up crash reporting for your indie game.

A five-minute smoke test suite running on every push will catch more regressions than a weekly four-hour playtest session. Automate the boring checks so your testers can focus on the interesting ones.