You've watched this happen. Maybe to your engineering team, maybe to yourself.
Someone opens Cursor or Claude Code or v0, types a few sentences, and twenty minutes later there's a working app. The first version is magic. The second version still works. Somewhere around the third or fourth iteration, things start to drift — the agent rewrites a function it wasn't supposed to touch, the styling breaks, the data model quietly changes, a feature you didn't ask for shows up, the thing you actually asked for doesn't.
You ship a fix. The fix introduces two new problems. By the time you've got it working again you've spent more time than if you'd written the original yourself, and you don't trust the codebase.
Most PMs I know have had this exact experience. They tried vibecoding, got a working prototype faster than they'd believed possible, and then watched it disintegrate. Most concluded one of two things: either the tools aren't ready yet, or they themselves aren't ready yet and should have learned to code properly first.
Both conclusions are wrong. The tools are fine. You don't need to learn to code. What you need is a brief the agent can actually execute against.
The real bottleneck isn't the agent
Here's the thing nobody is saying clearly enough: coding agents have gotten extraordinarily good at the execution layer. They write working code. They debug. They refactor. They pick reasonable libraries. They handle edge cases you wouldn't think to mention.
What they don't do is fill in gaps the way a human engineer would. When you hand a vague spec to a senior engineer, they push back. They ask questions. They tell you which decisions you haven't actually made yet. They notice when two parts of the requirement contradict each other. They quietly substitute good defaults where you haven't specified one. They remember what you said yesterday and apply it to today's work.
Coding agents do none of that reliably. They execute literally what's in front of them, in the order it's in front of them, with whatever defaults their training nudges them toward. If your brief is ambiguous, the agent picks an interpretation — and it might not be yours. If your brief contradicts itself, the agent picks one side — and it might not be the side you'd have chosen. If your brief leaves a decision implicit, the agent makes it for you — and you'll find out at v0.3, when the consequence is now load-bearing.
This is not a flaw in the agent. This is what literal execution looks like.
The bottleneck isn't downstream of the spec. The bottleneck is the spec itself. The agent is the most patient, fastest, cheapest engineer you've ever worked with. It's also the one most demanding of a clear brief, because it has the least context and the most willingness to confidently fill in your gaps.
You already know how to write specs
Here's the part most vibecoding content gets wrong. It treats the PM as a beginner, someone learning a new skill from scratch. You're not. You've been writing specs for years. You've been writing PRDs, user stories, acceptance criteria, edge case lists, design briefs, RFCs you didn't have to call RFCs. You know how to make engineering teams build the thing you actually wanted, most of the time.
The skill transfers. It transfers almost entirely. There's just a small delta.
The agent handoff doc is a PRD with three additions. That's the whole framework.
One: gating rules. Things the agent must check with you before doing. A PRD doesn't usually have these because human engineers default to asking when they're unsure. Agents default to deciding when they're unsure. So you tell them explicitly: stop and ask before adding any new dependency, before changing the data model, before touching anything in this list of files, before making any decision that affects more than one screen. Gating rules are how you reclaim the asking behaviour that human engineers do for free.
Two: what-NOT-to-do hard rules. The negatives. Most PRDs are written entirely in the positive: here's what the system should do. Agents need the negative space drawn explicitly. Don't use localStorage for this data. Don't add a backend. Don't introduce a build step. Don't import any UI library. Don't write tests yet. The negatives matter more than the positives because the positives only constrain the feature itself; the negatives constrain everything the agent builds around it. Without them, the agent will pull in conventions and patterns you didn't ask for.
Three: escalation protocol. What the agent should do when it hits something genuinely ambiguous. PRDs handle this implicitly: the engineer messages you on Slack. Agents need it spelled out: pause, document the ambiguity, propose two or three options, wait for your call. The temptation when you're vibecoding is to let the agent run autonomously because the speed is intoxicating. The escalation protocol is what keeps speed from turning into drift.
That's it. PRD plus three additions. Familiar shape, small delta, low cognitive cost.
A worked example
Let me show you a real one. Folio is an ebook reader I'm building — single HTML file, runs in the browser, no server, no accounts, handles EPUB and CBZ and MOBI, syncs reading progress through a sidecar file you control. The kind of thing engineering teams won't prioritise because it doesn't have a business case, but I want it for myself.
The handoff doc for Folio runs about 2,000 words. Most of it is the standard PRD content — vision, target experience, supported formats, file structure, persistence model. That part you already know how to write. Let me show you the parts that are new.
A few of Folio's gating rules:
- Stop and confirm before adding any third-party dependency. The project floor is vanilla JS, zero build step. New libraries break that floor.
- Stop and confirm before changing the library data model. The schema is referenced by sidecar files users will already have on disk; changes are migrations, not refactors.
- Stop and confirm before touching the File System Access (FSA) permission flow. It's the most fragile part of the app and the part users complain about most.
- Stop and confirm before adding any feature not listed in the v1 scope, even if it seems small. v1 scope creep is how this project doesn't ship.
Notice the pattern. Each rule names a specific decision class, names why the gate exists, and is short enough that the agent will actually remember to apply it. Vague gating rules ("ask before doing anything major") don't work. The agent's definition of "major" won't match yours.
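If it helps to see that pattern as data, here is a minimal sketch in Python. It assumes each rule is stored as a decision class plus a reason, and it flags rules that have no reason or lean on a vague word. The names GatingRule, vague_rules, and VAGUE_WORDS are hypothetical, invented for this illustration, and the word list is only a starting point.

```python
from dataclasses import dataclass

# Words that usually signal a gate the agent cannot apply (illustrative list).
VAGUE_WORDS = ("major", "significant", "important", "big")


@dataclass(frozen=True)
class GatingRule:
    decision: str  # the specific decision class the agent must not make alone
    reason: str    # why the gate exists

    def render(self) -> str:
        """Format the rule the way it would appear in the handoff doc."""
        return f"- Stop and confirm before {self.decision}. {self.reason}"


def vague_rules(rules: list[GatingRule]) -> list[GatingRule]:
    """Return rules with no stated reason or with a vague decision class."""
    flagged = []
    for rule in rules:
        words = [w.strip(".,") for w in rule.decision.lower().split()]
        if not rule.reason.strip() or any(w in VAGUE_WORDS for w in words):
            flagged.append(rule)
    return flagged


rules = [
    GatingRule("adding any third-party dependency",
               "The project floor is vanilla JS, zero build step."),
    GatingRule("doing anything major", "It might break things."),
]

for rule in vague_rules(rules):
    print("Too vague:", rule.decision)
```

A check like this won't write good rules for you, but it can catch the ones that say "major" before the agent has to guess what you meant.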
A few of Folio's what-NOT-to-do entries:
- Do not use IndexedDB as the primary persistence layer for library state. OPFS is primary; IndexedDB mirrors for search.
- Do not implement a custom EPUB renderer. Use an existing library.
- Do not add reading statistics, gamification, social features, or recommendations. Folio is a reader, not a platform.
- Do not introduce any tracking, analytics, or telemetry, even error reporting. The app runs offline; behave like it.
- Do not write a service worker yet. Defer until v1.1.
Each of these is a decision an agent would otherwise make on its own. IndexedDB is the obvious default for browser storage, so the agent would reach for it. EPUB rendering is complex enough that the agent might try to roll its own. Reading apps "should" have stats, in some default sense, so the agent might suggest them. Telemetry is industry standard. Service workers are best practice. All of these are wrong for Folio, and saying so explicitly is the only way to keep them out.
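Written negatives can also be spot-checked after the agent has run. The sketch below is one possible approach, not part of Folio itself: it scans generated source for text that hints at a broken rule. The FORBIDDEN table and the find_violations helper are hypothetical names, the patterns are rough assumptions that will miss cases and raise false alarms, and openBook in the sample is a made-up function.

```python
import re

# Each entry pairs a what-NOT-to-do rule with a pattern that hints at a violation.
# The patterns are illustrative assumptions, not an exhaustive list.
FORBIDDEN = {
    "no service worker before v1.1": re.compile(r"serviceWorker\s*\.\s*register"),
    "no tracking, analytics, or telemetry": re.compile(
        r"sendBeacon|gtag\(|analytics", re.IGNORECASE
    ),
}


def find_violations(source: str) -> list[tuple[int, str]]:
    """Return (line_number, rule) pairs for each line matching a forbidden pattern."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for rule, pattern in FORBIDDEN.items():
            if pattern.search(line):
                hits.append((number, rule))
    return hits


# Sample agent output; openBook is a hypothetical function name.
sample = """\
const book = await openBook(file);
navigator.serviceWorker.register('/sw.js');
"""

for number, rule in find_violations(sample):
    print(f"line {number}: {rule}")
```

A text scan like this is a tripwire, not a guarantee. The rule in the brief is still what keeps the pattern out in the first place.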
Folio's escalation protocol, condensed:
When you hit a decision that's not covered by this brief, the gating rules, or the what-NOT-to-do list:
- Stop work.
- Write the ambiguity into a QUESTIONS.md file at the project root.
- Propose 2-3 options with trade-offs.
- Wait for the call before proceeding.
- Do not pick a default and continue.
The last line is the important one. Without it, agents will note the ambiguity, pick a reasonable-looking option, mention it in passing, and barrel forward. By the time you read the note, the choice is load-bearing in the codebase and reversing it costs an hour.
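To make the protocol above concrete, here is a minimal sketch of what one logged question could look like. The brief only says to write the ambiguity into QUESTIONS.md with 2-3 options; the entry layout, the "WAITING FOR DECISION" status line, and the log_question helper are my assumptions for illustration, and the sample question is invented.

```python
import tempfile
from pathlib import Path


def log_question(root: Path, ambiguity: str, options: list[str]) -> str:
    """Append an open question to QUESTIONS.md and return the entry text."""
    if not 2 <= len(options) <= 3:
        raise ValueError("propose 2-3 options with trade-offs")
    lines = [f"## {ambiguity}", ""]
    lines += [f"{i}. {option}" for i, option in enumerate(options, start=1)]
    # No default is chosen here: the entry stays open until a human decides.
    lines += ["", "Status: WAITING FOR DECISION", ""]
    entry = "\n".join(lines) + "\n"
    with open(root / "QUESTIONS.md", "a", encoding="utf-8") as handle:
        handle.write(entry)
    return entry


# Usage: a temporary directory stands in for the project root.
with tempfile.TemporaryDirectory() as tmp:
    entry = log_question(
        Path(tmp),
        "Where should reading progress be stored for CBZ files?",
        [
            "Same sidecar file as EPUB: one format, but larger files.",
            "Separate sidecar per format: simpler schema, more files on disk.",
        ],
    )
    saved = (Path(tmp) / "QUESTIONS.md").read_text(encoding="utf-8")

print(entry)
```

The point of the status line is that nothing in the entry records a chosen option. The choice happens outside the file, by you.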
The five rules
Distilled, the methodology is five things:
- Write the brief like a PRD, because it is one. Don't reinvent. Your spec skills transfer.
- Add gating rules for the decision classes the agent shouldn't make alone. Be specific about which classes. Name the why.
- Write the negatives. What-NOT-to-do matters more than what-to-do, because it constrains the defaults the agent would otherwise reach for.
- Define the escalation protocol explicitly. Including "do not pick a default and continue."
- Resist scope drift inside the brief itself. v1 is v1. Future versions are future briefs. If the agent suggests a feature mid-build, it goes into a backlog file, not into v1.
That's the whole thing. Five rules. The shift from "vibecoding falls apart at v0.3" to "vibecoding ships v1, v1.1, v1.2 cleanly" is mostly contained in those five.
What this changes for you
If you're a PM in 2026, the question isn't whether AI changes your job. It's already changed it. The question is whether you're one of the PMs who ship things directly now, or one of those who still need an engineering team to materialise their ideas.
This isn't about replacing your engineers. They still build the systems that matter — the production infrastructure, the security model, the things that scale to millions of users. What changes is the long tail. The internal tool you've been asking for. The dashboard that would make your weekly review actually work. The prototype that would make next quarter's pitch tangible instead of theoretical. The small app you'd build for yourself if you could. The demo you need for next week's exec review and engineering can't get to until next sprint.
PMs who can ship that long tail pull ahead. They make better product decisions because they've actually built and used things. They escape the perpetual dependency on engineering bandwidth that defines most PM careers. They become the kind of PM who shows up to a meeting with a working prototype instead of a slide.
The methodology in this piece is what makes that shift sustainable. Without it, you ship one working app and then get burned by the next three and quietly give up. With it, you ship reliably.
What's next
I'm running a free live workshop on this in three weeks. We'll take an idea you bring — something small, something you'd actually use — and walk it through the entire methodology end to end. By the end of the session you'll have a handoff doc for your idea and a clear sense of whether to take it further.
Date and registration below. Bring an idea. Not a polished one. The smaller and rougher, the better.
If you're not ready for the live session, leave your email and I'll send you the recording when it's available, plus the agent handoff doc template I use myself.