Quick answer: Text adventure bugs cluster in two places, the parser that turns typed input into a command and the world model that the command mutates. Track the raw input string, how the parser tokenized it, which verb and noun it resolved to, and the world state before and after. Unrecognized commands are gold, since they show the vocabulary players expect but you did not implement.

In a text adventure the player types into a void and trusts that your parser understands them. When it does not, they do not file a neat bug report; they rephrase, get frustrated, and quit. There are two distinct failure surfaces. The parser can mishandle the words, resolving the wrong noun or rejecting a reasonable synonym. The world model can be wrong, letting the player take an item that should be fixed in place or describing a room that no longer matches reality. Tracking both means capturing the raw input and the world state around every command, so you can see what players meant and what your game did with it.

The parser is your most error-prone surface

Every typed line passes through tokenization, vocabulary lookup, disambiguation, and verb resolution before anything happens in the world. A bug at any stage produces the same flat failure: the player sees a canned line about not understanding, and you have no record of why. Maybe the verb was a synonym you forgot, maybe two nouns in scope collided and disambiguation picked wrong, maybe an adjective was ignored. The visible output erases all of that, so the parser is simultaneously your highest-traffic and least observable component.

The fix is to log the parse, not just the result. For each input, record the raw string, the token list, the candidate verbs and nouns considered, the one chosen, and the rule that fired. When a player reports that examining the painting did nothing, the log might show the parser bound painting to a scenery object in the wrong room because it was still in scope after a move. That is a one-line scope fix, but only if you can see the resolution path the parser took.
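A minimal sketch of what logging the parse could look like, in Python. Everything here is an assumption made for illustration: the VERBS and STOP_WORDS tables, the ParseTrace record, and the idea that scope is a mapping from object id to the nouns that name it. A real parser will have more stages, but the point is that each stage writes into the trace instead of collapsing to a canned refusal.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional

# Hypothetical vocabulary tables; a real game would load its own.
VERBS = {"examine": "examine", "x": "examine", "look": "examine",
         "take": "take", "get": "take"}
STOP_WORDS = {"the", "a", "an", "at"}


@dataclass
class ParseTrace:
    raw: str                                   # exactly what the player typed
    tokens: list = field(default_factory=list)
    verb: Optional[str] = None                 # canonical verb, if recognized
    candidates: list = field(default_factory=list)  # object ids considered
    chosen: Optional[str] = None               # object id the parser bound
    rule: str = "unparsed"                     # which rule decided the outcome


def parse(raw, scope):
    """Parse one line. scope maps object id -> set of nouns naming it."""
    trace = ParseTrace(raw=raw)
    trace.tokens = [t for t in raw.lower().split() if t not in STOP_WORDS]
    if not trace.tokens:
        trace.rule = "empty-input"
        return trace
    trace.verb = VERBS.get(trace.tokens[0])
    if trace.verb is None:
        trace.rule = "unknown-verb"
        return trace
    nouns = set(trace.tokens[1:])
    trace.candidates = sorted(o for o, names in scope.items() if names & nouns)
    if not trace.candidates:
        trace.rule = "no-noun-in-scope"
    elif len(trace.candidates) == 1:
        trace.chosen = trace.candidates[0]
        trace.rule = "unique-match"
    else:
        # Deliberately naive tie-break, so the trace exposes the guess.
        trace.chosen = trace.candidates[0]
        trace.rule = "ambiguous-first-match"
    return trace


# Two paintings are in scope because one was never removed after a move.
scope = {"gallery_painting": {"painting"},
         "hall_painting": {"painting", "portrait"}}
print(json.dumps(asdict(parse("examine the painting", scope))))
```

The printed record lists both paintings as candidates and names the tie-break rule that fired, which is the detail the visible refusal message would have erased.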

World state before and after every command

A text adventure command is a transition: it reads the world model and writes a new one. Bugs in the model show up as states that should be impossible: an item in two rooms at once, a container that is both open and locked, an NPC still present after they should have left. To catch these you want a snapshot of the relevant object tree immediately before and after the command that produced the complaint, so you can see exactly which property changed and whether it changed to a legal value.

Diffing those two snapshots is the fastest debugging move in the genre. If the player says taking the lamp left them holding nothing, the diff shows the lamp's location changing to the player while its carried flag stays untouched, so the inventory query skips it. Without the before-and-after you would be reading code and guessing; with it you read two short state dumps and the bug points at itself. Make the world model dumpable as text and most model bugs become trivial to localize.
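The lamp example can be sketched in a few lines of Python. This assumes the world model is a plain dictionary of object id to property dictionary, which is a simplification chosen for illustration; snapshot and diff_state are hypothetical helper names, not part of any engine.

```python
import copy


def snapshot(world):
    """Deep-copy the world so later mutation cannot alter the snapshot."""
    return copy.deepcopy(world)


def diff_state(before, after):
    """Return {object_id: {prop: (old, new)}} for every changed property."""
    changes = {}
    for obj_id in sorted(before.keys() | after.keys()):
        old = before.get(obj_id, {})
        new = after.get(obj_id, {})
        changed = {
            prop: (old.get(prop), new.get(prop))
            for prop in sorted(old.keys() | new.keys())
            if old.get(prop) != new.get(prop)
        }
        if changed:
            changes[obj_id] = changed
    return changes


world = {
    "lamp": {"location": "cellar", "carried": False},
    "player": {"location": "cellar"},
}

before = snapshot(world)
# Simulated buggy "take" handler: moves the lamp but forgets the flag.
world["lamp"]["location"] = "player"
after = snapshot(world)

print(diff_state(before, after))
```

The diff contains the lamp's location change and nothing else. The absence of carried from the diff is the bug report: the handler moved the object without marking it as held.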

Unrecognized commands are a feature request log

The inputs your parser rejects are the most valuable data your game produces. When dozens of players type pull rope, climb wall, or light torch and get nothing, they are telling you the vocabulary they expect the world to support. Some of those are real bugs where the action should work and a synonym is missing. Others are content gaps where players reasonably believe a verb should do something. Either way, the rejected-input stream is a direct line into the gap between your world and the player's mental model of it.

Aggregate them. A single rejected command is noise, but the same phrase rejected across many sessions in the same room is a clear signal, and counting the sessions ranks those signals by frequency. Add the common synonyms, implement the actions players keep reaching for, and write better refusal messages for the ones you deliberately do not support. Over a few updates this turns a parser that feels brittle into one that feels like it anticipated the player, which is the entire illusion the genre depends on.
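One way to do that aggregation, sketched in Python. The event shape (session id, room, raw input) and the min_sessions threshold are assumptions for the example; the important design choice is counting distinct sessions rather than raw occurrences, so one player retyping the same phrase ten times does not outrank ten players typing it once.

```python
def normalize(raw):
    """Lowercase and collapse whitespace so trivial variants group together."""
    return " ".join(raw.lower().split())


def rank_rejections(events, min_sessions=2):
    """Rank rejected phrases by how many distinct sessions hit them.

    events: iterable of (session_id, room, raw_input) tuples.
    Returns a list of (room, phrase, session_count), most common first.
    """
    sessions = {}
    for session_id, room, raw in events:
        sessions.setdefault((room, normalize(raw)), set()).add(session_id)
    ranked = [
        (room, phrase, len(ids))
        for (room, phrase), ids in sessions.items()
        if len(ids) >= min_sessions
    ]
    ranked.sort(key=lambda row: (-row[2], row[0], row[1]))
    return ranked


events = [
    ("s1", "cliff", "pull rope"),
    ("s2", "cliff", "Pull  rope"),
    ("s3", "cliff", "pull rope"),
    ("s1", "cliff", "pull rope"),      # same session again: not double-counted
    ("s2", "cellar", "light torch"),
    ("s4", "cellar", "light torch"),
    ("s5", "cliff", "xyzzy"),          # one-off: filtered out as noise
]

for room, phrase, count in rank_rejections(events):
    print(f"{count} sessions  {room}: {phrase}")
```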

Disambiguation, scope, and pronoun bugs

The subtle parser bugs live in scope and reference. Scope decides which objects are reachable for a command, and getting it wrong means the player can examine things in another room or cannot reach something right in front of them. Pronouns add a second layer: "it" and "them" refer back to the last noun, and if that binding is stale, "take it" picks up the wrong object entirely. These bugs are maddening because they depend on conversational history, so they never reproduce from a cold start.

Capturing the recent command history alongside the failing input is what makes them tractable. With the last several inputs and their resolved nouns, you can see that "it" was bound three commands ago and never refreshed when the player moved. Tracking the pronoun-resolution chain and the scope set at parse time turns a ghost into a clear sequence. These are the bugs that distinguish a parser that feels alive from one that feels like it has amnesia, so they are worth the logging effort.
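A rough Python sketch of that bookkeeping follows. ReferenceTracker and its method names are hypothetical, and representing scope as a set of object ids is an assumption; the sketch only shows which facts are worth keeping next to each command so a stale binding becomes visible.

```python
from collections import deque


class ReferenceTracker:
    """Keeps recent commands and the current binding of the pronoun "it"."""

    def __init__(self, history_size=5):
        self.history = deque(maxlen=history_size)  # oldest entries fall off
        self.it = None          # object id that "it" currently refers to
        self.bound_at = None    # turn number when that binding was made
        self.turn = 0

    def record(self, raw, resolved_noun, room):
        """Log one command; rebind "it" only when a noun was resolved."""
        self.turn += 1
        if resolved_noun is not None:
            self.it = resolved_noun
            self.bound_at = self.turn
        self.history.append(
            {"turn": self.turn, "raw": raw, "noun": resolved_noun, "room": room}
        )

    def resolve_it(self, scope):
        """Return the bound object only if it is still in scope."""
        return self.it if self.it in scope else None

    def report(self, scope):
        """Everything needed to explain a pronoun failure after the fact."""
        return {
            "it": self.it,
            "bound_at": self.bound_at,
            "turn": self.turn,
            "in_scope": self.it in scope,
            "scope": sorted(scope),
            "history": list(self.history),
        }


tracker = ReferenceTracker()
tracker.record("examine key", "brass_key", "hall")
tracker.record("north", None, "kitchen")
tracker.record("look", None, "kitchen")

kitchen_scope = {"knife"}
print(tracker.resolve_it(kitchen_scope))   # stale binding is refused
print(tracker.report(kitchen_scope))
```

In the report, bound_at is 1 while turn is 3 and in_scope is false, which reads directly as "bound before the move, never refreshed".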

Setting it up with Bugnet

Bugnet's in-game report button can fire from inside your game loop with the parser and world context already attached. Push the raw input, the token list, the resolved verb and noun, and a dump of the nearby object tree into custom fields, and route the recent command history through player attributes. A report then arrives not as the player saying it did not work, but as the exact line they typed, how the parser read it, and the world state it acted on, which is everything you need to reproduce without a single follow-up question.

Because the same parser gap gets hit by many players, Bugnet's occurrence grouping folds identical rejected inputs into one issue with a count. That count ranks your missing vocabulary by demand, so you implement pull rope before some rare phrasing one person tried. Filtering by room or by the verb that failed lets you cluster bugs around a specific scene or a specific handler, turning a flood of one-line parser complaints into a prioritized, reproducible backlog in one dashboard.

Testing the parser like the program it is

A parser is a program with a near-infinite input space, so test it like one. Build a transcript suite: a list of input lines paired with the expected output and world state, replayed on every build. Seed it with every bug you fix so regressions cannot creep back, and add the real rejected commands your players produce as new test cases. A transcript that diffs cleanly is the closest a text adventure gets to a green test bar, and it catches the subtle scope and synonym regressions that manual play misses.
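A transcript runner can be very small. The Python sketch below assumes the game exposes an execute method that returns the output text and a snapshot method that returns comparable state; both names, and the TinyGame stand-in, are invented here purely so the example runs.

```python
class TinyGame:
    """Stand-in game, present only to make the sketch runnable."""

    def __init__(self):
        self.lamp_location = "cellar"

    def execute(self, line):
        if line in ("take lamp", "get lamp"):
            if self.lamp_location == "player":
                return "You already have it."
            self.lamp_location = "player"
            return "Taken."
        return "I don't understand that."

    def snapshot(self):
        return {"lamp": self.lamp_location}


def run_transcript(make_game, steps):
    """Replay steps against a fresh game and collect every mismatch.

    steps: list of (input_line, expected_output, expected_state).
    Returns a list of failure dicts; an empty list means a clean diff.
    """
    game = make_game()
    failures = []
    for number, (line, want_out, want_state) in enumerate(steps, start=1):
        got_out = game.execute(line)
        got_state = game.snapshot()
        if got_out != want_out or got_state != want_state:
            failures.append({
                "step": number,
                "input": line,
                "expected": (want_out, want_state),
                "actual": (got_out, got_state),
            })
    return failures


transcript = [
    ("take lamp", "Taken.", {"lamp": "player"}),
    ("get lamp", "You already have it.", {"lamp": "player"}),
]
print(run_transcript(TinyGame, transcript))

# A rejected command harvested from players, added as a failing test
# until the "grab" synonym is implemented.
wishlist = [("grab lamp", "Taken.", {"lamp": "player"})]
print(run_transcript(TinyGame, wishlist))
```

Starting each transcript from a fresh game keeps the runs independent, so a failure in one does not cascade into the next.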

Beyond automated transcripts, watch how real players phrase things and let that reshape your grammar. Indie text games win or lose on the feeling that the world understands intent, and that feeling is built one synonym and one fixed transition at a time. Capture the raw inputs, diff the world state, aggregate the rejections, and the parser stops being a black box. The result is a game that meets players where they type, which is the whole promise of the form.

In a text adventure the bug is either in how the parser read the words or in how the command changed the world. Capture both and the fix is usually obvious.