Quick answer: Confirm the user is signed in via IOnlineIdentity::GetLoginStatus. Verify that DefaultPlatformService in DefaultEngine.ini matches your intended backend. Use the LogOnline log category for diagnostic output.
A multiplayer prototype calls SessionInterface->CreateSession(0, NAME_GameSession, Settings). The callback fires with bWasSuccessful = false, and no specific error is returned. The Steam overlay is running, the build targets Steam, and the settings look fine.
OnlineSubsystem Quick Audit
Before CreateSession can succeed:
- The OnlineSubsystem is loaded and active.
- The local user is signed in (identity layer).
- The session settings are valid.
- The backend (Steam, Epic) is reachable.
Failure in any layer aborts session creation.
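The audit order above can be sketched as a single check that reports the first blocking layer. This is an engine-independent illustration, not Unreal API: SessionPreconditions and FirstFailingLayer are hypothetical names, and in a real project each field would be filled from the corresponding OnlineSubsystem query.

```cpp
#include <string>

// Hypothetical snapshot of the four layers listed above. In a real project
// each field would be filled from the corresponding OnlineSubsystem query.
struct SessionPreconditions {
    bool subsystemLoaded = false;   // IOnlineSubsystem::Get() returned non-null
    bool userLoggedIn = false;      // identity layer reports LoggedIn
    int numPublicConnections = 0;   // taken from the session settings
    bool backendReachable = false;  // e.g. the Steam client is running
};

// Returns the name of the first layer that would block CreateSession,
// or an empty string when every layer passes.
std::string FirstFailingLayer(const SessionPreconditions& p) {
    if (!p.subsystemLoaded) return "subsystem";
    if (!p.userLoggedIn) return "identity";
    if (p.numPublicConnections < 1) return "settings";
    if (!p.backendReachable) return "backend";
    return "";
}
```

Checking in this order mirrors the dependency chain: there is little point inspecting settings while the subsystem itself is missing.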
Step 1: Verify Subsystem Config
DefaultEngine.ini should contain:
[OnlineSubsystem]
DefaultPlatformService=Steam
[OnlineSubsystemSteam]
bEnabled=true
SteamDevAppId=480
GameServerQueryPort=27015
[/Script/OnlineSubsystemSteam.SteamNetDriver]
NetConnectionClassName="OnlineSubsystemSteam.SteamNetConnection"
[/Script/Engine.GameEngine]
+NetDriverDefinitions=(DefName="GameNetDriver",DriverClassName="OnlineSubsystemSteam.SteamNetDriver",DriverClassNameFallback="OnlineSubsystemUtils.IpNetDriver")
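This configuration is for Steam specifically. For Epic Online Services, set DefaultPlatformService to the name of the EOS subsystem instead (EOS in the stock OnlineSubsystemEOS plugin) and swap the Steam-specific sections for their EOS equivalents. SteamDevAppId 480 is the Spacewar test app: it works for development, but replace it with your real app ID before shipping.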
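If you want to check the config from a build script or a test rather than by eye, a small standalone reader is enough. The sketch below is plain C++ with hypothetical helper names (Trim, ReadIniValue). It reads the text of one ini file only, so it knows nothing about Unreal's config layering, platform overrides, or the +/- array prefixes, and it returns the first match it finds.

```cpp
#include <sstream>
#include <string>

// Trims spaces, tabs and carriage returns from both ends of a string.
static std::string Trim(const std::string& s) {
    const char* ws = " \t\r";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Returns the value of `key` inside `[section]`, or "" if it is absent.
std::string ReadIniValue(const std::string& iniText,
                         const std::string& section,
                         const std::string& key) {
    std::istringstream in(iniText);
    std::string line;
    bool inSection = false;
    while (std::getline(in, line)) {
        line = Trim(line);
        if (line.empty() || line[0] == ';') continue;  // blank line or comment
        if (line.front() == '[' && line.back() == ']') {
            // Section header: remember whether it is the one we want.
            inSection = (line.substr(1, line.size() - 2) == section);
            continue;
        }
        if (!inSection) continue;
        const auto eq = line.find('=');
        if (eq == std::string::npos) continue;
        if (Trim(line.substr(0, eq)) == key) return Trim(line.substr(eq + 1));
    }
    return "";
}
```

Calling ReadIniValue(text, "OnlineSubsystem", "DefaultPlatformService") on the contents of DefaultEngine.ini and comparing the result with the backend you expect turns a silent misconfiguration into an explicit failure.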
Step 2: Identity Check
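#include "OnlineSubsystem.h"
#include "Interfaces/OnlineIdentityInterface.h"

// Run inside the function that is about to call CreateSession.
IOnlineSubsystem* OSS = IOnlineSubsystem::Get();
if (!OSS) {
    UE_LOG(LogTemp, Error, TEXT("OSS not loaded"));
    return;
}
// The identity interface pointer can be invalid, so check it before use.
IOnlineIdentityPtr Identity = OSS->GetIdentityInterface();
if (!Identity.IsValid()) {
    UE_LOG(LogTemp, Error, TEXT("OSS has no identity interface"));
    return;
}
// Local user 0 must be fully logged in before any session call.
if (Identity->GetLoginStatus(0) != ELoginStatus::LoggedIn) {
    UE_LOG(LogTemp, Error, TEXT("Not logged in to OSS identity"));
    return;
}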
If LoginStatus is NotLoggedIn or UsingLocalProfile, CreateSession will fail. Trigger login via the platform’s flow before any session API.
Step 3: Session Settings Sanity
FOnlineSessionSettings Settings;
Settings.NumPublicConnections = 4; // at least 1
Settings.bShouldAdvertise = true;
Settings.bAllowJoinInProgress = true;
Settings.bIsLANMatch = false;
Settings.bUsesPresence = true;
Settings.bAllowJoinViaPresence = true;
A NumPublicConnections value of 0 fails silently on Steam. Set it to your intended player count.
Step 4: Read the OSS Logs
Set LogOnline and LogOnlineSession to Verbose in DefaultEngine.ini:
[Core.Log]
LogOnline=Verbose
LogOnlineSession=Verbose
Rerun the game. The Output Log should now show the specific failure reason, along the lines of “identity not logged in”, “subsystem disabled”, or “invalid settings”. Read it and act accordingly.
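Verbose logging produces a lot of output, so it can help to filter a saved log down to the lines that matter. The sketch below is plain C++ with a hypothetical function name, and it assumes log lines carry a "Category: Verbosity:" prefix such as "LogOnline: Warning:". Treat that layout as an assumption and adjust the markers to whatever your build actually prints.

```cpp
#include <string>
#include <vector>

// Keeps only lines from the online log categories that carry a Warning or
// Error marker. The "Category: Verbosity:" layout is an assumption about the
// log format; adjust the markers if your build prints something different.
std::vector<std::string> FilterOnlineProblems(const std::vector<std::string>& lines) {
    std::vector<std::string> hits;
    for (const std::string& line : lines) {
        // "LogOnline" is a prefix of "LogOnlineSession", so one search covers both.
        const bool onlineCategory = line.find("LogOnline") != std::string::npos;
        const bool problem =
            line.find(": Warning:") != std::string::npos ||
            line.find(": Error:") != std::string::npos;
        if (onlineCategory && problem) hits.push_back(line);
    }
    return hits;
}
```

Verbose-level lines are dropped on purpose: start with warnings and errors, and go back to the full log only if they do not explain the failure.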
Steam-Specific Trap: Wrong Process
If you launch the editor from Windows Explorer instead of through Steam, Steam SDK initialization may fail. Either:
- Launch through Steam (using a steam_appid.txt file for development).
- Or place a steam_appid.txt in the editor binary directory with your app ID.
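A quick way to rule this trap out is to check for the file before anything else. The following is a minimal C++17 sketch using only the standard library. ReadSteamAppId is a hypothetical helper, and the directory you pass in is whatever folder your editor or game executable runs from.

```cpp
#include <filesystem>
#include <fstream>
#include <optional>
#include <string>

// Reads the app ID from steam_appid.txt in `binaryDir`.
// Returns std::nullopt when the file is missing, empty, or does not start
// with a token of 1 to 9 decimal digits.
std::optional<unsigned long> ReadSteamAppId(const std::filesystem::path& binaryDir) {
    std::ifstream file(binaryDir / "steam_appid.txt");
    if (!file) return std::nullopt;
    std::string token;
    file >> token;  // first whitespace-delimited token, empty if the file is empty
    if (token.empty() || token.size() > 9 ||
        token.find_first_not_of("0123456789") != std::string::npos) {
        return std::nullopt;
    }
    return std::stoul(token);
}
```

Logging the returned value at startup also catches the case where the file exists but still holds 480 when you meant to test with your own app ID.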
Verifying
Add logging around every session call. The OnCreateSessionComplete callback should fire with bWasSuccessful = true within a few seconds. If it still fails, the verbose log will pinpoint the layer.
Understanding the issue
Bugs like this one are emergent. The code is correct in isolation; the behavior emerges from interaction with other systems. Reproducing means controlling the interaction; fixing means deciding which interaction was wrong.
The specific bug described above is the kind that surfaces during integration rather than unit testing. It depends on a combination of factors: the asset configuration, the runtime state, the platform's specific behavior. In isolation, each piece looks correct; in combination, the bug emerges. This is why thorough integration testing - playing the actual game in realistic conditions - catches things that automated tests miss.
Why this happens
The triage path for this kind of bug is long. The symptom appears in gameplay, but the cause is in a different system. The reporter describes the gameplay effect; the engineer has to translate that into a hypothesis about the underlying cause. Misdirection is common.
At the engine level, the behavior comes from a deliberate design decision in Unreal. The engine team chose a particular trade-off - usually performance versus convenience, or generality versus specificity - and that trade-off has consequences when you push against it. Understanding the trade-off is what turns 'this bug is mysterious' into 'this bug is the expected consequence of this design'.
Verifying the fix
After applying the fix, the verification step has three parts: confirm the original repro is resolved, confirm no obvious regressions in adjacent functionality, and (for shipping titles) deploy to a small player cohort first and watch the crash and report rates. Each step catches something the others miss.
Reproducibility is the prerequisite for verification. If you can't reliably reproduce the bug pre-fix, you can't reliably verify it post-fix. Spend time getting a clean reproduction before you write any fix code. The fix is fast once you understand the reproduction; the reproduction is the slow part.
Variations to watch for
There's almost always a less obvious case where the same problem applies. The reported case is the one a player hit; the related cases hide because they're rarer or affect fewer players. After fixing the reported case, search the codebase for the pattern - one fix often unlocks several.
Adjacent bugs often share a root cause. After fixing the case you've found, spend an hour searching the codebase for similar patterns. What's the same call with different arguments? The same data flow with a different entity type? The same lifecycle issue in a sibling system? Each match is a candidate for the same fix, or a related fix that prevents future bugs of the same class.
In production
For shipping titles with a long support window, watch for this issue resurfacing after dependency updates. Engine upgrades, driver updates, OS releases - each one can resurface a bug class you thought you'd fixed because the underlying behavior changed slightly. Regression tests catch the obvious ones; player reports catch the rest.
When triaging a similar issue in production, prioritize gathering data over hypothesizing causes. A player report describes a symptom; what you need is a build SHA, a session timestamp, and ideally a screen recording or session replay. With those, the bug becomes tractable. Without them, you're guessing at hypothetical reproductions that may not match what the player actually hit.
Performance considerations
Performance implications matter when this bug class scales with player count or asset count. A bug that fires once per session is annoying; a bug that fires once per frame compounds. After fixing, profile the affected code path under realistic load. The fix that's correct for one entity may be too slow for ten thousand.
Diagnostic approach
Diagnosing this class of bug benefits from a structured approach: confirm the symptom, isolate the variables, hypothesize the cause, and verify the hypothesis before writing fix code. Skipping the isolation step is the most common mistake; without it, fixes often address symptoms while the underlying cause continues to produce other variations.
For Unreal-specific diagnostics, the editor's profiler is the canonical starting point. Capture a representative frame with the symptom present; compare against a frame without the symptom; the diff often points directly at the cause. If the symptom is non-deterministic, capture multiple frames and look for the pattern - the cause is usually a state transition or a specific input value rather than a continuous effect.
Tooling and ecosystem
Modern engine versions ship better tooling for this kind of issue than older versions. If you're on an older release, the diagnostic step may take significantly longer because the tools you'd want don't exist yet. Sometimes the right answer is upgrading rather than fighting through limited tooling.
Within Unreal, the relevant diagnostic surfaces include the standard frame debugger, memory profiler, and engine-specific debug overlays. Each one shows a different facet of what's happening. The frame debugger reveals draw call ordering and state transitions; the memory profiler shows allocation patterns; the debug overlay reveals per-system state. Bugs that resist one tool usually surrender to another - the trick is knowing which tool to reach for first.
Edge cases and pitfalls
Boundary conditions deserve specific testing attention. What happens when the input is zero, maximum, negative, or NaN? What happens at the start of a session vs hours in? What happens at the boundary between two systems handling the same data? These are where bugs hide and where regression tests are most valuable.
When writing a regression test for this fix, focus on the boundary conditions that surfaced the original bug. Tests that exercise the happy path catch obvious regressions; tests that exercise the boundary catch the subtler regressions that look like new bugs but are really the original returning. The latter are the tests that earn their keep over the long life of the project.
Team communication
Document the fix and its rationale in the commit message or attached engineering doc. Future engineers will encounter related issues; the rationale tells them whether your fix is reusable or specific to the case at hand. Without rationale, the fix gets reverted or copied incorrectly.
If this fix touches a system several engineers work in, a short writeup in the team's engineering channel helps. Not a full design doc - a paragraph explaining what was wrong, what's fixed, and what to watch for. Future engineers encountering similar symptoms will search for the fix; making it findable is a small investment that pays back later.
“OnlineSubsystem failures cascade through layers. Identity, settings, subsystem, backend — check each one.”
Wire a debug UI showing OSS status (subsystem name, identity status, session count) — saves hours diagnosing in builds.
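One way to structure that debug UI is to keep the formatting separate from the engine calls, so the status line is easy to test. The sketch below is plain C++. OssStatus and FormatOssStatus are hypothetical names, and the fields are assumed to be filled elsewhere from the subsystem, identity, and session interfaces.

```cpp
#include <string>

// Hypothetical snapshot of the values the debug UI would display.
struct OssStatus {
    std::string subsystemName;  // e.g. "Steam", or "" if no subsystem is loaded
    std::string loginStatus;    // e.g. "LoggedIn" or "NotLoggedIn"
    int sessionCount = 0;       // number of sessions currently tracked
};

// Formats the snapshot as one line suitable for an on-screen debug message.
std::string FormatOssStatus(const OssStatus& s) {
    const std::string name = s.subsystemName.empty() ? "<none>" : s.subsystemName;
    return "OSS: " + name + " | Login: " + s.loginStatus +
           " | Sessions: " + std::to_string(s.sessionCount);
}
```

In an Unreal project the resulting string could be converted to an FString and shown with GEngine->AddOnScreenDebugMessage, or bound to a UMG text block.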