Summary
I hated writing spec documents. Sit down, open a markdown file, stare at the cursor for twenty minutes, then write something that still misses half the context I have in my head. A month ago I traded that for a workflow where I dictate a brain dump into Claude Code through Whisprflow, the agent reformulates it into a structured spec and explicitly asks whether it understood me correctly. Ten minutes instead of an hour — and a spec with more detail than anything I'd type by hand.
Why I'm writing this
This workflow grew out of frustration. I wanted the agent to ship a feature I had clearly in my head, but I couldn't get it onto the page. I wrote a terse spec, the agent generated half-baked code, I added another requirement, the agent rewrote, missed the same edge case again. After the third iteration I figured out the problem wasn't the agent — the problem was my input.
If you want good code from an agent, you have to give it a good spec. And writing a good spec by hand is the kind of pain no senior engineer has the patience for.
The trap of the written spec
The moment I open markdown, several things happen at once, and all of them degrade the output:
- Writing slows my thinking. Every time I remember more context, I have to wait until I finish the sentence. By the time I'm done, a third of what I remembered has evaporated.
- The editor pulls me toward editing. I see a sentence that reads awkwardly and fix the word order instead of dumping the next thought. Form wins over substance.
- The result looks polished but is missing three edge cases. I didn't think of them while writing because my brain was hovering over phrasing, not the problem.
- Half-baked spec → half-baked code. The agent ships an implementation that does 70% of what I wanted, and I spend another half hour explaining the remaining 30%.
This is exactly the moment vibecoding becomes vibedebugging.
The workflow: three steps, ten minutes
The workflow has three parts. None of them is complex, none of them needs a special tool — Whisprflow can be replaced by anything that does dictation, Claude Code by anything that holds a conversation.
1. A prepared prompt in Claude Code
I have a prompt that flips Claude Code into spec interview mode. In that mode the agent doesn't write code. Instead it listens, summarizes, asks follow-up questions, and explicitly confirms understanding. It's like having a junior product manager who walks you through structure.
I'm not publishing the prompt verbatim in this post — I'll send it to anyone who asks at info@kontradigital.com. The reason: I tuned the prompt over several weeks, but its exact wording matters less than the principle behind it.
2. Dictating through Whisprflow
I open Claude Code in the terminal, run the prompt, and turn on Whisprflow. Then I just talk. I ramble, repeat myself, circle back to things I remembered two sentences later. The context flows straight from my head into text, with no editing filter in the way.
I use Whisprflow because it's fast, it doesn't need a context switch, and it runs in the background. It's not the only option — the principle of the workflow is spoken brain dump, not a specific tool. If asked, I'd recommend Whisprflow, but macOS dictation or SuperWhisper work equally well.
3. The reflexive loop
This is the part where the workflow diverges from plain dictation into a markdown file. After the brain dump, the agent doesn't do what you'd expect — it doesn't save the text. Instead:
- It summarizes the problem in its own words.
- It writes bullet points: "I understand this as…", "I'm assuming…", "what I didn't catch: …".
- It asks: "Did I understand correctly?"
- It waits for yes / no / "almost, also X".
If I say no, I add the missing context out loud, and the loop repeats. When the agent verbalizes what it heard, it often surfaces gaps I'd never spot while writing: the model doesn't have the implied context in my head, so it has to state explicitly what it's assuming.
After two or three loops I have a spec I've said yes to. That spec then becomes the input for the implementation phase (a fresh agent, or simply the next prompt in the same session).
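The loop is simple enough to sketch in a few lines of code. Everything here is illustrative, not a real SDK: `ask_agent` stands in for whatever chat API drives your agent, and `ask_user` for however you collect the yes / no / "almost, also X" reply.

```python
from typing import Callable

def reflexive_loop(
    brain_dump: str,
    ask_agent: Callable[[str], str],  # hypothetical wrapper around your chat API
    ask_user: Callable[[str], str],   # collects yes / no / "almost, also X"
    max_rounds: int = 3,
) -> str:
    """Summarize, confirm, fold corrections back in, until the user says yes."""
    context = brain_dump
    summary = ""
    for _ in range(max_rounds):
        summary = ask_agent(
            "Summarize this spec in your own words: what you understood, "
            "what you are assuming, what you did not catch. "
            "Then ask: 'Did I understand correctly?'\n\n" + context
        )
        reply = ask_user(summary)
        if reply.strip().lower() == "yes":
            return summary                    # confirmed spec
        context += "\nCorrection: " + reply   # missing context joins the next round
    return summary
```

The cap on rounds rarely matters in practice; as the text says, two or three iterations are usually enough.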
What this looks like in practice
I recently used this to spec a feature for Skillsmith — placeholder variables in skill.md that get rewritten per vendor at export time (file reference syntax differs between Claude and Cursor).
The brain dump went something like:
"I have a file reference in a skill, but Claude wants `@path/to/file.ts` and Cursor wants `[file](mdc:path/to/file.ts)`. I don't want to maintain two versions, I have one source of truth. Probably a placeholder, something like `{{file path/to/file.ts}}`, and the adapter rewrites it. But I'm also wondering if there's some other syntax for this, maybe…"
After one round of "did I understand correctly?", the agent extracted:
- problem: vendor-specific syntax for file references
- proposed solution: placeholder syntax `{{file …}}`
- open question: should the placeholder cover more than files (URLs, anchors inside files)?
- what's missing: behaviour for invalid paths, escape syntax for literal braces
I'd have forgotten that last one if I'd been writing. The agent asked for it because it had to verbalize what it was assuming. After the second round the spec had a five-section structure and could go straight to implementation.
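The rewrite the spec describes is small enough to sketch. This is my illustration, not Skillsmith's actual API — the function and table names are made up, only the two vendor formats come from the brain dump above:

```python
import re

# {{file some/path}} -> vendor-specific file reference at export time.
PLACEHOLDER = re.compile(r"\{\{file\s+([^}]+?)\s*\}\}")

# The two formats the brain dump mentions; keys and lambdas are illustrative.
VENDOR_FORMATS = {
    "claude": lambda path: f"@{path}",
    "cursor": lambda path: f"[file](mdc:{path})",
}

def export_skill(source: str, vendor: str) -> str:
    """Rewrite every {{file ...}} placeholder into the vendor's own syntax."""
    fmt = VENDOR_FORMATS[vendor]
    return PLACEHOLDER.sub(lambda m: fmt(m.group(1)), source)
```

So `export_skill("See {{file path/to/file.ts}}.", "claude")` yields `See @path/to/file.ts.` Note that the agent's "what's missing" bullets — invalid paths, literal braces — are exactly the cases this sketch doesn't handle yet, which is why surfacing them at spec time pays off.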
Why it works better than writing
Spoken language has higher throughput than written — I can pour out three times as much context per minute as I can type. The agent acts as a rubber duck that talks back, surfacing implicit assumptions before they materialize in code. The validation loop forces me to explicitly confirm understanding instead of implicitly hoping. And the output is immediately usable as input for the implementation phase — no rewrite needed.
What you need to try it
- Claude Code (or Cursor, Codex — any agent that holds a conversation, not just code completion).
- Whisprflow (or any dictation tool — the principle is spoken brain dump, not a specific tool).
- A prompt that flips the agent into spec interview mode. I'll send mine on request: info@kontradigital.com.
Closing
Spec writing doesn't have to be misery. A spoken brain dump with a reflexive agent gives you a better input for AI implementation than an hour of hand-written markdown — and it doesn't drain the mental energy you need for the feature itself. Try it once. If the workflow clicks, you'll feel that hand-writing specs is as dated as writing code reviews in a plain text editor.