The most common risk in AI-assisted development is not that the model cannot write code. It is that the model starts writing code too early.
A person says, “please fix this,” and the agent quickly reads files, guesses the cause, edits code, runs tests, and summarizes the result. That can look productive. But the important project questions have not been answered yet: is the request mature enough, and how far is the AI allowed to go?
That is why a project-specific AI delivery pipeline starts with task intake and execution mode routing, not implementation.
Chat is not a task contract
Chat is good for discussion. It is not a reliable execution contract.
A conversation can mix complaints, guesses, goals, background, temporary ideas, and real acceptance criteria. A human can often separate those layers. An AI agent may not separate them consistently.
A task contract compresses the discussion into an executable boundary.
A minimal contract should include:
- the user problem;
- the expected behavior;
- current evidence;
- acceptance criteria;
- explicit non-goals;
- known risks;
- the source of truth;
- stop conditions.
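As a rough sketch, these fields can be written down as a small, machine-readable shape. The TypeScript interface below is illustrative only; the name TaskContract and every field name are assumptions, not part of any standard or tool:

```typescript
// Hypothetical sketch of a task contract. Field names are illustrative,
// not a standard; adapt them to your own tracker or pipeline.
interface TaskContract {
  problem: string;              // the user problem, in the user's terms
  expectedBehavior: string;     // what "done" looks like from the outside
  evidence: string[];           // logs, reproductions, links to reports
  acceptanceCriteria: string[]; // checks a reviewer can actually verify
  nonGoals: string[];           // explicitly out of scope
  knownRisks: string[];         // money, permissions, schema, production state, ...
  sourceOfTruth: string;        // the spec, doc, or owner the AI must defer to
  stopConditions: string[];     // situations that must halt execution
}
```

Where the contract is stored matters less than the rule that every field exists before execution starts.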
Without these fields, AI is improvising inside an ambiguous task.
Route before execution
After the task contract, the pipeline should choose an execution mode.
I use four modes.
auto-run is for low-risk, bounded, testable, reversible work: small UI copy changes, obvious bug fixes, documentation updates, or local test additions.
manual-triage is for unclear work or anything that may touch money, permissions, security, schema, production state, or core business semantics. AI can analyze, but it should not implement by default.
guarded-full-speed is for complex but bounded work. AI can keep moving, but only inside project-specific stop conditions, evidence requirements, and human gates.
spike is for exploration. A spike produces decision material, not production-ready delivery.
These modes are not about how smart the model is. They are about how much execution authority the project is willing to grant.
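To make the routing concrete, here is a minimal sketch, assuming each task arrives with a few coarse signals. The type names, field names, and heuristics are hypothetical; a real project would encode its own risk policy:

```typescript
// Hypothetical routing sketch. The mode names mirror the four modes above;
// the signals and heuristics are illustrative, not a complete policy.
type ExecutionMode = "auto-run" | "manual-triage" | "guarded-full-speed" | "spike";

interface TaskSignals {
  isExploratory: boolean;        // the task is a research question, not a delivery
  touchesSensitiveArea: boolean; // money, permissions, security, schema, production state
  hasAcceptanceCriteria: boolean;
  isBounded: boolean;            // scope is limited and known in advance
  isReversible: boolean;         // the change can be rolled back cheaply
}

function routeTask(s: TaskSignals): ExecutionMode {
  if (s.isExploratory) return "spike";
  if (s.touchesSensitiveArea || !s.hasAcceptanceCriteria) {
    return "manual-triage";      // AI may analyze, but not implement by default
  }
  if (s.isBounded && s.isReversible) return "auto-run";
  return "guarded-full-speed";   // keep moving, but only inside stop conditions and gates
}
```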
Auto-run must be opt-in
Automation should not be on by default.
Issues, product discussions, and project notes contain many things that should not be executed automatically: user feedback, product ideas, incidents, compliance questions, and temporary notes.
Only explicitly opted-in tasks should let AI claim, analyze, implement, and open PRs automatically.
The opt-in is not just a label. It means the project has confirmed that scope, risk, and acceptance are clear enough for the pipeline to handle.
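One way to express that, sketched below under the same assumptions as the contract shape above, is an explicit marker the pipeline checks before granting any execution authority. The label name ai-auto-run is hypothetical:

```typescript
// Hypothetical opt-in gate. "ai-auto-run" is an illustrative label name;
// the point is that automation is granted per task, never assumed.
function mayExecuteAutomatically(labels: string[], contract: TaskContract | null): boolean {
  const optedIn = labels.includes("ai-auto-run");
  const contractComplete =
    contract !== null &&
    contract.acceptanceCriteria.length > 0 &&
    contract.stopConditions.length > 0;
  return optedIn && contractComplete;
}
```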
Stop conditions matter more than prompt advice
Many prompts say, “ask me if you are unsure.” That is too weak.
The project should encode stop conditions in the task contract or execution rules. Examples:
- acceptance criteria are missing;
- requirements conflict;
- core business semantics would change;
- production data would be modified;
- money, trading, permissions, security, or compliance is involved;
- the evidence gate fails;
- AI cannot find a reliable source of truth.
When these conditions appear, AI should not keep trying to finish. It should move the task to a state such as needs-input, review-required, or failed-needs-human.
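A minimal sketch of that transition, assuming the stop conditions are evaluated as simple flags; the state names follow the article, everything else is illustrative:

```typescript
// Hypothetical stop-condition check. Each flag corresponds to one of the
// conditions listed above; the resulting states match the article's names.
type TaskState = "in-progress" | "needs-input" | "review-required" | "failed-needs-human";

interface StopSignals {
  missingAcceptanceCriteria: boolean;
  conflictingRequirements: boolean;
  changesCoreSemantics: boolean;
  touchesProductionData: boolean;
  touchesSensitiveDomain: boolean; // money, trading, permissions, security, compliance
  evidenceGateFailed: boolean;
  noSourceOfTruth: boolean;
}

function nextState(s: StopSignals): TaskState {
  if (s.missingAcceptanceCriteria || s.conflictingRequirements || s.noSourceOfTruth) {
    return "needs-input";        // the task itself is not ready
  }
  if (s.changesCoreSemantics || s.touchesProductionData || s.touchesSensitiveDomain) {
    return "review-required";    // a human must approve before anything ships
  }
  if (s.evidenceGateFailed) {
    return "failed-needs-human"; // AI could not prove the result; stop, do not retry
  }
  return "in-progress";
}
```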
This is not bureaucracy
Task contracts may look like process overhead. In practice, they reduce rework.
Without a contract, AI may produce a patch quickly, but you spend more time deciding whether it solved the real problem, expanded scope, or crossed a boundary.
With a contract, the work is narrower, clearer, and easier to validate. Even failure becomes useful because you can locate whether the failure came from requirements, context, implementation, tests, or evidence.
That is the first rule of the delivery pipeline: turn the request into an executable contract, then decide whether AI is allowed to act.

