Quick answer: Normalize every platform’s stack format into a common representation of (module, function, offset) tuples, strip platform-specific noise (runtime frames, crash handlers, allocator frames), treat inlined helpers as real frames, then hash the crash type together with the top N user-code function names. Symbol servers resolve release-build frames at ingestion time. Done right, the same bug produces the same fingerprint whether it crashed on Windows, Mac, Linux, PlayStation, or Android.

Cross-platform crash deduplication is harder than single-platform because every platform lies to you in its own way. Windows keeps its symbols in PDB files and decorates names with MSVC’s own scheme. Mac and Linux use DWARF. Consoles use proprietary formats you can only read with the platform SDK. Android gives you either native C/C++ frames or Java/Kotlin frames depending on where the crash happened. And every compiler inlines differently, so the “same bug” can produce ten different stack traces. If you hash the raw stacks, you get ten bugs. If you normalize first, you get one.

Parse Every Format into a Common Representation

The first step is a parser per platform that turns a raw stack into a list of StackFrame structs with the same fields: module name, function name (demangled), source file (optional), line number (optional), and offset. Everything downstream works on this common representation, which means only the parsers care about platform differences.

type StackFrame struct {
    Module   string
    Function string // demangled, no parameters
    File     string // optional
    Line     int    // optional
    Offset   uint64 // within function
    Inlined  bool
}

// ParseStack dispatches to the platform-specific parser. Only these
// parsers know about platform formats; everything else sees StackFrame.
func ParseStack(platform string, raw []byte) ([]StackFrame, error) {
    switch platform {
    case "windows":
        return parsePDB(raw)
    case "macos", "linux":
        return parseDWARF(raw)
    case "android":
        return parseAndroid(raw)
    case "playstation":
        return parsePS(raw)
    default:
        return nil, fmt.Errorf("unknown platform %q", platform)
    }
}
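
As an illustration of what one of these parsers does, here is a minimal sketch of the per-line step for native Android frames. The line shape it expects (a frame number, `pc`, a hex address, a library path, and an optional `(function+offset)` suffix with a decimal offset) is an assumption about the input, and `parseNativeLine` is a hypothetical helper, not part of any SDK.

```go
package main

import (
    "path"
    "regexp"
    "strconv"
)

// StackFrame is a subset of the struct above, repeated so the sketch
// compiles on its own.
type StackFrame struct {
    Module   string
    Function string
    Offset   uint64
}

// frameLine matches lines shaped like:
//   #01 pc 00000000000a1b2c  /path/to/libgame.so (Func(args)+52)
// The "(function+offset)" part is optional because unsymbolized frames
// carry only the address and the library path.
var frameLine = regexp.MustCompile(
    `^\s*#\d+\s+pc\s+[0-9a-fA-F]+\s+(\S+)(?:\s+\((.+)\+(\d+)\))?`)

// parseNativeLine turns one backtrace line into a StackFrame.
// It returns false for lines that are not frames.
func parseNativeLine(line string) (StackFrame, bool) {
    m := frameLine.FindStringSubmatch(line)
    if m == nil {
        return StackFrame{}, false
    }
    // Keep only the library file name; install paths differ per device.
    f := StackFrame{Module: path.Base(m[1]), Function: m[2]}
    if m[3] != "" {
        off, err := strconv.ParseUint(m[3], 10, 64)
        if err != nil {
            return StackFrame{}, false
        }
        f.Offset = off
    }
    return f, true
}
```

The function name still carries its parameter list at this point. Removing it belongs to the normalization step, so that every platform's parser can share the same cleanup code.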

Normalize function names aggressively. Strip namespace prefixes consistently, collapse template parameters (std::vector<T> becomes std::vector), and fold anonymous namespaces. Without this, (anonymous namespace)::Foo::Bar on Linux will not match Foo::Bar on Windows even though they are the same function.
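
A minimal sketch of that normalization, assuming names arrive already demangled. `NormalizeFunction` is a hypothetical helper, and the two anonymous-namespace spellings it folds are the ones GCC/Clang and MSVC tools print. It does not handle `operator<`, `operator()`, and similar names, which a real pipeline would need to special-case.

```go
package main

import "strings"

// anonMarkers lists the anonymous-namespace spellings to fold away:
// the first is printed by GCC/Clang demanglers, the second by MSVC.
var anonMarkers = []string{
    "(anonymous namespace)::",
    "`anonymous namespace'::",
}

// NormalizeFunction reduces a demangled C++ name to a form that is
// comparable across toolchains: no anonymous-namespace marker, no
// template arguments, no parameter list.
func NormalizeFunction(name string) string {
    for _, marker := range anonMarkers {
        name = strings.ReplaceAll(name, marker, "")
    }

    // Collapse template arguments by dropping everything nested in <>.
    var b strings.Builder
    depth := 0
    for _, r := range name {
        switch {
        case r == '<':
            depth++
        case r == '>':
            if depth > 0 {
                depth--
            }
        case depth == 0:
            b.WriteRune(r)
        }
    }
    name = b.String()

    // Drop the parameter list and any trailing qualifiers such as const.
    if i := strings.IndexByte(name, '('); i >= 0 {
        name = name[:i]
    }
    return strings.TrimSpace(name)
}
```

With this sketch, both `(anonymous namespace)::Foo::Bar(int)` and `Foo::Bar` come out as `Foo::Bar`, and `std::vector<int, std::allocator<int> >::push_back(int const&)` comes out as `std::vector::push_back`.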

Strip Noise Before Fingerprinting

The top of a crash stack is usually noise: the OS crash handler, your signal handler, the runtime exception dispatcher, and sometimes allocator frames. These frames are stable across all crashes, so including them in the fingerprint makes every crash look similar. Skip them and start fingerprinting at the first frame of actual user code.

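// Requires imports: regexp, strings.
// Compiled once at package scope rather than on every call.
var noiseFunc = regexp.MustCompile(
    `^(RaiseException|__cxa_throw|abort|__pthread_kill|` +
        `CrashHandler|SignalHandler|_CxxThrowException)$`)

func FilterNoise(frames []StackFrame) []StackFrame {
    out := make([]StackFrame, 0, len(frames))
    for _, f := range frames {
        if noiseFunc.MatchString(f.Function) {
            continue
        }
        // Compare in lower case and by prefix: Windows reports module
        // names in varying case, and Linux ships libc as libc.so.6.
        m := strings.ToLower(f.Module)
        if m == "kernel32.dll" || strings.HasPrefix(m, "libc.so") {
            continue
        }
        out = append(out, f)
    }
    return out
}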

Handle inlined frames with care. A function inlined on Windows may not be inlined on Mac, which means the “top frame” differs. If your symbol format exposes inline information, walk inlined frames as if they were real frames. This keeps fingerprints stable across compiler optimizations.
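
One way to sketch this, assuming your symbolizer can return every function covering an address, innermost first. `ResolvedAddress` and `ExpandInlines` are hypothetical names for illustration, not the API of any particular symbolizer.

```go
package main

// StackFrame is a subset of the struct above, repeated so the sketch
// compiles on its own.
type StackFrame struct {
    Module   string
    Function string
    Inlined  bool
}

// ResolvedAddress is a hypothetical symbolizer result for one physical
// frame: all functions covering the address, innermost first. A frame
// with no inlining has exactly one entry.
type ResolvedAddress struct {
    Module    string
    Functions []string
}

// ExpandInlines flattens inline chains so that an inlined helper shows
// up as its own frame, just as it would in a build that did not inline.
func ExpandInlines(addrs []ResolvedAddress) []StackFrame {
    var out []StackFrame
    for _, a := range addrs {
        for i, fn := range a.Functions {
            out = append(out, StackFrame{
                Module:   a.Module,
                Function: fn,
                // Every entry except the last (outermost) one was
                // inlined into its caller.
                Inlined: i < len(a.Functions)-1,
            })
        }
    }
    return out
}
```

A build that inlined `ReadHeader` into `LoadSave` reports one physical frame with two functions, while a build that kept the call reports two physical frames. After expansion both yield the same sequence of function names, which is what the fingerprint needs.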

Build a Fingerprint That Survives Builds

The fingerprint is a hash of the first N filtered frames. Use function names only; addresses and line numbers change between builds and make the fingerprint worthless. N between 3 and 5 works for most crashes; go deeper and you risk splitting one bug into several fingerprints because a caller several frames away from the crash differs or was renamed.

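func Fingerprint(frames []StackFrame, crashType string) string {
    h := sha256.New()
    // Terminate every field with "|" so that adjacent fields cannot
    // run together ("seg"+"faultFoo" vs "segfault"+"Foo").
    io.WriteString(h, crashType) // e.g. "segfault"
    io.WriteString(h, "|")
    depth := 5
    if len(frames) < depth {
        depth = len(frames)
    }
    for i := 0; i < depth; i++ {
        io.WriteString(h, frames[i].Function)
        io.WriteString(h, "|")
    }
    // Keep the first 16 hex characters (64 bits) as the bucket ID.
    return hex.EncodeToString(h.Sum(nil))[:16]
}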

Salt the fingerprint with the crash type (segfault, exception, assertion, deadlock). Two bugs with identical top frames but different crash mechanisms are different bugs. Without the salt, you will merge unrelated issues and waste time investigating one bug while the other continues to hurt players.

Hook Up a Symbol Server

Release builds ship stripped of symbols. Your ingestion pipeline needs to resolve raw addresses to function names before it can normalize and fingerprint. Do this by uploading symbols at build time to a symbol server keyed by module name and a per-build identifier: the PDB GUID and age on Windows, the Mach-O UUID on Mac, the GNU build ID on Linux. At ingestion, look up symbols by the identifier embedded in the crash report.

Microsoft’s Symbol Server format works well for Windows and is understood by WinDbg. For Mac and Linux, store .dSYM bundles and debug symbol files in the same layout. Many studios build a single symbol store behind an HTTP endpoint that handles all platforms with the same lookup protocol.
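
A minimal sketch of such a shared lookup key, modeled on the file name / identifier / file name layout that the Microsoft symbol store uses. `SymbolKey` is a hypothetical helper, and stripping dashes and uppercasing the identifier is an assumed normalization, chosen only so that every platform's ID is spelled one way.

```go
package main

import (
    "path"
    "strings"
)

// SymbolKey builds the relative path of a symbol file inside the store:
//   <debug file name>/<build identifier>/<debug file name>
// The identifier is whatever the platform embeds in the binary (for
// example PDB GUID plus age, Mach-O UUID, or GNU build ID).
func SymbolKey(debugFile, buildID string) string {
    // Assumed normalization: no dashes, upper case.
    id := strings.ToUpper(strings.ReplaceAll(buildID, "-", ""))
    return path.Join(debugFile, id, debugFile)
}
```

The build step uploads each symbol file under this key, and the ingestion step computes the same key from the crash report and fetches it over HTTP. Because the key depends only on the file name and the identifier, neither side needs to know which platform the other is handling.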

Verify With a Canary Bug

Before you trust your deduplication in production, write a test that intentionally crashes on every platform and verify all the crash reports fingerprint to the same value. This is easy to forget and easy to get wrong. A null pointer dereference in the same function on all platforms should produce one bug, not five. If it produces five, your parser or your noise filter is wrong.
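
The comparison at the end of that test can be sketched as below. `CheckCanary` is a hypothetical helper that takes the fingerprint each platform's canary crash produced; collecting those fingerprints from your pipeline is left out because it depends on your infrastructure.

```go
package main

import (
    "fmt"
    "sort"
)

// CheckCanary takes a map of platform name to the fingerprint its
// canary crash produced. It returns nil when every platform agrees and
// an error listing the disagreeing groups otherwise.
func CheckCanary(byPlatform map[string]string) error {
    // Group platforms by the fingerprint they produced.
    groups := map[string][]string{}
    for platform, fp := range byPlatform {
        groups[fp] = append(groups[fp], platform)
    }
    if len(groups) <= 1 {
        return nil
    }

    // Sort so the error message is stable from run to run.
    var lines []string
    for fp, platforms := range groups {
        sort.Strings(platforms)
        lines = append(lines, fmt.Sprintf("%s: %v", fp, platforms))
    }
    sort.Strings(lines)
    return fmt.Errorf("canary produced %d fingerprints: %v",
        len(groups), lines)
}
```

Grouping by fingerprint rather than just comparing values makes the failure message useful: it shows which platforms agree with each other, which usually points straight at the parser or filter that is out of step. Note that an empty map passes, so a real test should also assert that every expected platform reported in.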

“We had a save system crash that showed up as twelve different bugs because the top frame was an inlined helper that the Windows compiler kept and the Clang build dropped. Once we walked inlined frames, all twelve collapsed into one — with 40,000 occurrences.”

Related Issues

For the classification step that happens before dedup, see how to build a player report classifier. For crash handling on specific platforms, read automated crash reporting for indie games.

Pick one unsymbolized crash from your tracker and try to fingerprint it by hand. The steps you fumble through are exactly what your pipeline needs to automate.