Why do players say notes feel late when my judgment code is correct?

Because correctness in code does not account for hardware latency. Audio output buffers, input polling and the display all add delay before and after your engine sees a tap. If you judge against the audio clock without applying measured latency and the player's calibration offset, every player on a high-latency device gets marked late and reports it.

What should I capture when someone reports a timing problem?

Capture the calibration offsets in effect, audio buffer size and device, any measured round-trip latency, the beatmap and exact song position, and the recent average hit offset. That numeric profile tells you whether it is constant latency, calibration drift, input jitter or a chart sync issue without guessing.

Why does a song feel fine at the start but drift by the end?

That is accumulating error. The usual causes are a beatmap authored against a slightly different audio file, an unhandled tempo change, or an audio clock that stuttered during a load. A constant offset feels wrong from the first note, so progressive drift points at the chart or the playback clock, not your judgment window.

Bug Tracking for Rhythm and Music Games

Quick answer: Rhythm games fail in milliseconds. Audio output latency, input polling delay and a wrong calibration offset all show up as notes that feel late even though your judgment math is correct. To track these bugs you need the player's calibration values, audio device and buffer size, measured input and output latency, and the beatmap and section where it drifted. Without those numbers, "feels off" is untriageable.

A rhythm game is a contract about time. The note hits the line, the player taps, and the game must agree on whether that was perfect within a window measured in milliseconds. When that contract breaks, players do not file a stack trace, they say it feels off or the song drifts near the end. Behind that vague complaint is a precise failure: audio latency, an input poll arriving a frame late, a calibration offset that was never applied, or a beatmap whose timestamps drift against the actual audio. This post is about turning feels off into numbers you can fix.

Latency is the whole game

Every rhythm game fights a latency budget it does not fully control. Audio output latency depends on the device buffer size and the platform mixer, and it can swing from a few milliseconds on a wired setup to over a hundred on a Bluetooth headset. Input latency stacks on top: controller polling, USB hubs, and the display itself all add delay before your engine even sees the tap. If you judge timing against the audio clock without accounting for these, every player on a laptop speaker is judged as if their hardware were instant, and they all report being marked late.

The fix is to measure and store these numbers, not assume them. Capture the audio buffer size and sample rate, the output device type, and any measured round-trip latency from your calibration step. When a player reports that hits feel late, you can immediately see whether they are on a high-latency Bluetooth path and whether their calibration offset matches. A timing complaint without the latency profile is unfalsifiable. With it, you can tell a real judgment bug from a player who simply never ran calibration on a slow device.
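As a minimal sketch of what applying the stored numbers could look like at judgment time, the snippet below subtracts the measured latency and the player's calibration from the raw tap time. The `LatencyProfile` type, its field names, and the sign convention (positive means late) are assumptions for illustration, and it assumes the calibration offset is a residual correction on top of the measured latency rather than a second measurement of the same delay.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencyProfile:
    """Hypothetical per-player latency record; every value is in milliseconds."""
    output_latency_ms: float      # measured audio output delay
    input_latency_ms: float       # measured input path delay
    calibration_offset_ms: float  # player's residual correction from calibration


def hit_offset_ms(tap_audio_clock_ms: float, note_time_ms: float,
                  profile: LatencyProfile) -> float:
    """Return the compensated hit offset. Positive means the tap was late."""
    raw = tap_audio_clock_ms - note_time_ms
    compensation = (profile.output_latency_ms
                    + profile.input_latency_ms
                    + profile.calibration_offset_ms)
    return raw - compensation
```

With a zeroed profile the same tap is reported with its full raw delay, which is the "judged as if their hardware were instant" case.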

Calibration offsets and where they go wrong

Calibration is the player's personal correction for their hardware, usually two offsets: one for visual sync and one for audio or input sync. The bugs cluster around how and when those offsets are applied. A classic failure is applying the audio offset in one code path but not in the practice mode, so the song feels tight in a real run and loose in practice. Another is storing the offset in milliseconds but reading it as samples: a 20 ms correction read as 20 samples shrinks to under half a millisecond at 48 kHz, so the offset is effectively never applied. These produce reports that are maddeningly inconsistent unless you log the actual offset values in effect at the time.
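The unit mix-up is easy to show with arithmetic. The helpers below are a hypothetical sketch, not taken from any particular engine; the point is that the unit lives in the function and variable names, and every conversion goes through one place.

```python
def ms_to_samples(offset_ms: float, sample_rate_hz: int) -> int:
    """Convert a millisecond offset to a whole number of audio samples."""
    return round(offset_ms * sample_rate_hz / 1000)


def samples_to_ms(offset_samples: int, sample_rate_hz: int) -> float:
    """Convert a sample count back to milliseconds."""
    return offset_samples * 1000 / sample_rate_hz


stored_offset_ms = 20.0
sample_rate_hz = 48_000

# Correct path: 20 ms at 48 kHz is 960 samples.
correct_samples = ms_to_samples(stored_offset_ms, sample_rate_hz)

# Buggy path: the stored 20 is treated as if it were already a sample count,
# so the offset actually applied is a fraction of a millisecond.
applied_by_bug_ms = samples_to_ms(int(stored_offset_ms), sample_rate_hz)
```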

Calibration also drifts across updates. If you change how the offset is signed or where in the pipeline it applies, every returning player's stored value is now subtly wrong, and you get a wave of complaints right after a patch. Version your calibration data and capture it with every report, including which offsets were active and the build that wrote them. That lets you spot a regression where an entire cohort that calibrated on an older version is suddenly off by a fixed amount, which points straight at the pipeline change rather than at the players.
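One way to version calibration data is to store a schema number and the writing build next to the offsets, then migrate on load. The sketch below is illustrative only: the `Calibration` fields, the schema numbers, and the v1-to-v2 sign flip are all assumed for the example.

```python
from dataclasses import dataclass

CURRENT_SCHEMA = 2  # hypothetical current schema version


@dataclass(frozen=True)
class Calibration:
    schema_version: int
    audio_offset_ms: float
    visual_offset_ms: float
    written_by_build: str  # build that stored the values; kept for bug reports


def migrate(cal: Calibration) -> Calibration:
    """Upgrade stored calibration to the current schema.

    Assumed change for this example: schema 2 flipped the sign of the
    audio offset, so a v1 value must be negated once on load.
    """
    if cal.schema_version > CURRENT_SCHEMA:
        raise ValueError(f"calibration schema {cal.schema_version} is newer than this build")
    if cal.schema_version == 1:
        cal = Calibration(2, -cal.audio_offset_ms, cal.visual_offset_ms,
                          cal.written_by_build)
    return cal
```

Keeping `written_by_build` unchanged through migration preserves the cohort information a regression hunt needs.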

Beatmap and sync drift

Some bugs live in the chart, not the engine. A beatmap whose note timestamps were authored against a slightly different audio file, or against an assumed constant tempo when the track actually has a tempo change, will feel fine at the start and progressively wrong as the song goes. Players describe this as the song drifting near the end, which is the signature of accumulating error rather than a constant offset. To catch it you need to know the exact beatmap, the song position where it felt wrong, and ideally the difference between the audio clock and the chart's expected position at that point.
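Accumulating error can be told apart from a constant offset by fitting a line through hit offset against song position: the slope is the drift rate and the intercept is the constant part. The following is a plain least-squares sketch with hypothetical names, assuming offsets in milliseconds and positions in seconds.

```python
from typing import Sequence, Tuple


def fit_drift(song_positions_s: Sequence[float],
              hit_offsets_ms: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line through (position, offset) points.

    Returns (slope in ms per second of song, intercept in ms).
    """
    n = len(song_positions_s)
    if n < 2 or n != len(hit_offsets_ms):
        raise ValueError("need two or more matching position/offset pairs")
    mean_x = sum(song_positions_s) / n
    mean_y = sum(hit_offsets_ms) / n
    sxx = sum((x - mean_x) ** 2 for x in song_positions_s)
    if sxx == 0:
        raise ValueError("all hits share one song position; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(song_positions_s, hit_offsets_ms))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

A slope near zero with a large intercept looks like latency or calibration; a clearly nonzero slope looks like the chart or the playback clock.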

Streaming and looping introduce their own drift. If your audio clock comes from the playback position and that resets on a loop or stutters during a load, the chart and the music desync without any code being wrong in isolation. Capture the audio clock source and whether a buffer underrun or load hitch occurred during the section. A drift report tied to a specific song timestamp and a logged underrun is a solved bug. The same report with only "feels off after the chorus" could take a week of blind guessing.
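A simple way to log loop resets and stutters is to sample the audio clock alongside a monotonic system clock and flag any interval where the two advance by different amounts. This is a sketch under assumed names and an assumed 10 ms tolerance, not a description of any specific audio API.

```python
from typing import List, Sequence, Tuple


def find_clock_jumps(samples: Sequence[Tuple[float, float]],
                     tolerance_ms: float = 10.0) -> List[Tuple[float, float]]:
    """Flag intervals where the audio clock and system clock disagree.

    `samples` holds (system_ms, audio_ms) pairs in capture order.
    Returns (audio_ms after the jump, discrepancy_ms) for each flagged interval.
    A negative discrepancy means the audio clock fell behind or reset.
    """
    jumps = []
    for (sys_prev, aud_prev), (sys_now, aud_now) in zip(samples, samples[1:]):
        discrepancy = (aud_now - aud_prev) - (sys_now - sys_prev)
        if abs(discrepancy) > tolerance_ms:
            jumps.append((aud_now, discrepancy))
    return jumps
```

Attaching the returned list to a report gives the song timestamp of the hitch directly.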

Turning feels off into data

The discipline that makes rhythm bugs tractable is converting subjective complaints into recorded measurements at the moment of the hit. When a player flags a section, capture the judgment window settings, the running average hit offset over the last several notes, the audio clock versus the system clock, and the calibration in effect. A consistently positive average offset means everything is judged late, which is a latency or calibration problem. An average near zero with high variance means input jitter or frame pacing. The shape of the offset distribution tells you the cause before you read a single line of code.
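The mean-versus-variance reading can be reduced to a small classifier. The thresholds and label names below are assumptions chosen for illustration; real values would come from your own judgment windows.

```python
from statistics import mean, pstdev
from typing import Sequence


def classify_offsets(hit_offsets_ms: Sequence[float],
                     bias_threshold_ms: float = 15.0,
                     jitter_threshold_ms: float = 20.0) -> str:
    """Label a run of hit offsets (positive = late) by the shape of its distribution."""
    if len(hit_offsets_ms) < 2:
        raise ValueError("need at least two hits to describe a distribution")
    average = mean(hit_offsets_ms)
    biased = abs(average) > bias_threshold_ms
    jittery = pstdev(hit_offsets_ms) > jitter_threshold_ms
    if biased and jittery:
        return "mixed"
    if biased:
        # Consistent shift in one direction: latency or calibration.
        return "constant_late" if average > 0 else "constant_early"
    if jittery:
        # Centered but scattered: input jitter or frame pacing.
        return "jitter"
    return "within_window"
```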

Group these reports by their numeric signature rather than by the player's words. Many players reporting "late" on Bluetooth devices with a positive offset are one bug. A scattered set of high-variance reports on a single song's bridge is another. By attaching the timing telemetry to every report, you replace argument with arithmetic. You stop debating whether the game feels good and start reading whether the average offset for a device class is within your window, which is the only question that actually has an answer.
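Grouping by numeric signature can be as simple as bucketing each report into a tuple and counting. The report keys, bucket names, and thresholds here are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Tuple

Signature = Tuple[str, str, str]


def signature(report: Dict[str, Any], bias_ms: float = 15.0,
              jitter_ms: float = 20.0) -> Signature:
    """Reduce a report to (device class, bias direction, spread bucket)."""
    mean_offset = float(report["mean_offset_ms"])
    if mean_offset > bias_ms:
        direction = "late"
    elif mean_offset < -bias_ms:
        direction = "early"
    else:
        direction = "centered"
    spread = "jittery" if float(report["stdev_offset_ms"]) > jitter_ms else "steady"
    return (str(report["output_device"]), direction, spread)


def group_reports(reports: Iterable[Dict[str, Any]]) -> Counter:
    """Count reports per signature, ignoring whatever the player typed."""
    return Counter(signature(r) for r in reports)
```

Two players who wrote completely different complaints land in the same bucket if their numbers match.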

Setting it up with Bugnet

Bugnet's in-game report button is ideal for rhythm games because you can bind it to fire at the exact note the player flubbed and attach the timing context as custom fields. Send the calibration offsets, audio buffer size and device, measured latency, current beatmap, song position, and the recent hit-offset average. If the audio thread crashes or a buffer underrun throws, the crash report arrives with a stack trace and full device and platform details. A complaint that used to be feels off near the end becomes a record showing a forty millisecond drift at bar one hundred on a specific chart.
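The timing context listed above fits in a flat, JSON-serializable dictionary. The sketch below only builds that dictionary; it does not call any Bugnet API, and every field name is an assumption. How custom fields are actually attached depends on the SDK you use.

```python
import json
from typing import Any, Dict, Optional, Sequence


def build_timing_fields(*, audio_offset_ms: float, visual_offset_ms: float,
                        buffer_size_frames: int, output_device: str,
                        round_trip_latency_ms: Optional[float],
                        beatmap_id: str, song_position_ms: float,
                        recent_hit_offsets_ms: Sequence[float]) -> Dict[str, Any]:
    """Collect timing context into flat fields (all names are hypothetical)."""
    recent_avg = (sum(recent_hit_offsets_ms) / len(recent_hit_offsets_ms)
                  if recent_hit_offsets_ms else None)
    return {
        "audio_offset_ms": audio_offset_ms,
        "visual_offset_ms": visual_offset_ms,
        "buffer_size_frames": buffer_size_frames,
        "output_device": output_device,
        "round_trip_latency_ms": round_trip_latency_ms,  # None if never measured
        "beatmap_id": beatmap_id,
        "song_position_ms": song_position_ms,
        "recent_avg_hit_offset_ms": recent_avg,
    }


fields = build_timing_fields(
    audio_offset_ms=-12.0, visual_offset_ms=4.0, buffer_size_frames=512,
    output_device="bluetooth", round_trip_latency_ms=None,
    beatmap_id="example-chart", song_position_ms=183_250.0,
    recent_hit_offsets_ms=[38.0, 42.0, 40.0],
)
payload = json.dumps(fields)  # ready to hand to whatever reporting call you use
```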

Occurrence grouping then collapses the noise. The hundreds of late reports from Bluetooth users fold into one issue with a big count, and you can filter by the device or latency custom field to confirm it is hardware-class specific rather than a judgment bug. Everything sits in one dashboard, so triaging timing issues becomes sorting by occurrence count and reading the attached numbers. You spend your time on the drift that hits real charts for real cohorts instead of on whichever player complained loudest in your community channel.

Calibration-first testing and shipping

Bake timing verification into your test process the way you would bake in a build step. Keep a reference rig with known, measured latency and assert that the average hit offset for an automated perfect-tap bot stays inside your window across every chart after each change. Add a test that loops a song and checks the chart stays synced past the loop point. Most timing regressions are introduced silently by a refactor of the audio pipeline, and a numeric assertion catches them before a single player feels the drift.
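A perfect-tap bot check might look like the sketch below. The rig latency, the window size, and `judge_offset_ms` are stand-ins: in a real suite the judge would be your engine's own judgment function and the charts would be loaded from disk.

```python
from typing import Dict, Sequence

REFERENCE_RIG_LATENCY_MS = 42.0  # assumed measured latency of the reference rig
PERFECT_WINDOW_MS = 16.0         # assumed half-width of the perfect window


def judge_offset_ms(tap_ms: float, note_ms: float, compensation_ms: float) -> float:
    """Stand-in for the engine's judgment math; positive means late."""
    return tap_ms - note_ms - compensation_ms


def bot_mean_offset_ms(note_times_ms: Sequence[float], rig_latency_ms: float,
                       compensation_ms: float) -> float:
    """Bot taps exactly on each note; the rig delays every tap by its known latency."""
    offsets = [judge_offset_ms(note + rig_latency_ms, note, compensation_ms)
               for note in note_times_ms]
    return sum(offsets) / len(offsets)


def charts_outside_window(charts: Dict[str, Sequence[float]], rig_latency_ms: float,
                          compensation_ms: float,
                          window_ms: float = PERFECT_WINDOW_MS) -> Dict[str, float]:
    """Return each chart whose mean bot offset falls outside the window."""
    failures = {}
    for name, notes in charts.items():
        mean_offset = bot_mean_offset_ms(notes, rig_latency_ms, compensation_ms)
        if abs(mean_offset) > window_ms:
            failures[name] = mean_offset
    return failures
```

Asserting that `charts_outside_window(...)` returns an empty dictionary after each change turns a silent audio-pipeline regression into a failing test.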

Make calibration a first-class, hard-to-skip part of onboarding, and log when players skip it, because skipped calibration is the root of a large share of timing complaints. Treat the captured offset distributions as a living health metric for the game's feel rather than as a pile of complaints. Shipped well, a rhythm game gives every player a tight, fair window on their own hardware, and your bug tracking quietly proves it by showing average offsets sitting comfortably inside the judgment window song after song.

In a rhythm game, feels off is always a number. Capture the latency, calibration and hit offset and the bug stops being an argument.