
The Phantom-Ship Audit
How a commit message that claimed a feature was shipped — when the code was never actually committed — led to a new audit pattern for multi-phase projects.
The discovery
We were at Phase 74 of a Pokémon ROM hack. The project had a detailed ledger — docs/build-artifacts.md — tracking every phase, its commit SHA, and what it claimed to ship.
Phase 15’s entry said: “VexRefuse now uses trainerbattle_single.”
This is the good-ending climax. The player fights the final boss. It’s load-bearing for the entire post-game.
I ran: git show <phase-15-commit> -- <file> | grep trainerbattle_single
It returned nothing. The function call was never added. The commit message and the ledger both claimed it shipped, but the code wasn’t there.
The entire good ending — the final boss battle, the victory sequence, the post-game content gated behind it — had been “completed” 59 phases ago without the actual battle ever being wired up.
Why reasoning-based reviews miss this
A deep-dive review that reads the design docs and checks the architecture will catch design-level bugs. But it accepts the commit message’s claim that the implementation matches the design.
The phantom-ship pattern is invisible to reasoning because the design is correct. What is wrong is the claim that the implementation exists, and no amount of reading the design will reveal that. Catching it takes a mechanical check (grep for the symbol the commit claims to have added), not a reasoning check.
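As a rough illustration of what "mechanical" means here, the sketch below checks whether a symbol appears on an added line of a unified diff. It is slightly stricter than the plain grep used in this post, which matches any line of the output, including removed lines and context. The function name symbol_added and the sample diff are hypothetical, made up for this example.

```python
def symbol_added(diff_text: str, symbol: str) -> bool:
    """Return True if any added line of a unified diff contains `symbol`.

    Hypothetical helper for illustration; not part of the project described here.
    """
    for line in diff_text.splitlines():
        # Added lines start with '+'. Lines starting with '+++ ' are treated as
        # the new-file header and skipped. (Edge case: an added line whose own
        # content begins with '++ ' would also be skipped by this simple test.)
        if line.startswith("+") and not line.startswith("+++ "):
            if symbol in line:
                return True
    return False


# Made-up diff: the symbol is only on a removed line, so the claim
# "this commit added trainerbattle_single" would not be backed by it.
SAMPLE_DIFF = (
    "--- a/script.inc\n"
    "+++ b/script.inc\n"
    "@@ -1,2 +1,2 @@\n"
    "-\ttrainerbattle_single PLACEHOLDER\n"
    "+\tmsgbox PlaceholderText\n"
)

print(symbol_added(SAMPLE_DIFF, "trainerbattle_single"))  # False
print(symbol_added(SAMPLE_DIFF, "msgbox"))                # True
```

Looking only at added lines also sidesteps the commit message that git show prints above the diff, which could otherwise mention the symbol and produce a false match.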
The audit-subagent pattern
For any project with N shipped phases accumulating commit messages that say “shipped X”, audit each claim:
- Read the ledger claim (what the commit message says it did)
- Grep the cited commit (git show <commit> -- <file> | grep <cited-symbol>)
- Empty result = false claim
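The three steps above can be sketched as a small loop over ledger claims. This is a minimal sketch under assumptions, not the audit subagent actually used: the Claim record, its fields, and the function names are hypothetical, and how claims get extracted from docs/build-artifacts.md is left out because the ledger format is not shown here. The git call uses --format= to suppress the commit message so that only the diff is searched.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Claim:
    """One ledger claim (hypothetical shape): phase N's commit added `symbol` to `path`."""
    phase: int
    commit: str
    path: str
    symbol: str


def git_show(commit: str, path: str) -> str:
    """Return the diff that `commit` made to `path`, without the commit message."""
    result = subprocess.run(
        ["git", "show", "--format=", commit, "--", path],
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. on an unknown commit
    )
    return result.stdout


def false_claims(
    claims: Iterable[Claim],
    show: Callable[[str, str], str] = git_show,
) -> List[Claim]:
    """Return the claims whose symbol never appears in the cited commit's diff.

    Mirrors `git show ... | grep <symbol>`: an empty match means a false claim.
    `show` is injectable so the logic can be exercised without a repository.
    """
    return [c for c in claims if c.symbol not in show(c.commit, c.path)]


# Usage inside a repository (values are placeholders, not real SHAs or paths):
#   suspects = false_claims([Claim(15, "<phase-15-commit>", "<file>", "trainerbattle_single")])
#   for c in suspects:
#       print(f"Phase {c.phase}: {c.symbol} not found in {c.commit}")
```

Keeping the git call behind a parameter is a design choice for this sketch: it lets the matching logic be tested against canned diffs, and it makes it easy to swap in a stricter check such as matching added lines only.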
I dispatch this as a leaf subagent running mechanical grep+read — no reasoning needed, just pattern matching. On the 1.18M-line codebase, 4 audits yielded 1 Critical + 1 High + 1 Medium finding in about 30-40 minutes.
The deep-dive review (design correctness) and the audit-subagent (implementation matches design) are complementary. Run both before any v1.0 milestone.