The most dangerous thing about an AI agent is not just that it can be wrong, but that it can make uncertain claims sound coherent enough to act on.

In ordinary Q&A, that produces bad answers. In an agent workflow, fluent text can override real source evidence and trigger the wrong tool call.

Operational governance therefore needs a source-authority rule: when user text, model inference, structured data, tool output, and system state conflict, which one wins?

Fluent text is weak evidence

Natural language is useful for expressing intent. It is not automatically a source of truth.

Users may misremember. Project documents may be outdated. AI summaries may miss boundaries. Chat context may be stale. Model inference built on any of these should not be treated as primary evidence.

In engineering workflows, sources should be layered, listed here roughly from highest to lowest authority:

runtime state: current system state, tool output, database query results, tests, logs, and actual file content.

structured source: configuration, schemas, manifests, frontmatter, API responses, and traces.

reviewed document: maintained specifications, decision records, product truth pages, and approved policy.

raw material: web pages, screenshots, meeting notes, user input, and unprocessed documents.

model inference: explanation, summarization, classification, and judgment generated by AI.

This particular ordering is not universal, but some explicit hierarchy must exist. Without one, the agent will often trust the most fluent, recent, or instruction-like text in context.
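
One way to make the layering explicit is to encode it as an ordered type that the agent consults before trusting a claim. The following is a minimal sketch, not a prescribed implementation: the names SourceLayer, Claim, and highest_authority are hypothetical, and the numeric ranking simply mirrors the order of the layers above, which a given project may need to change.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import IntEnum


class SourceLayer(IntEnum):
    """Example ranking only: a higher value means higher authority."""
    MODEL_INFERENCE = 1
    RAW_MATERIAL = 2
    REVIEWED_DOCUMENT = 3
    STRUCTURED_SOURCE = 4
    RUNTIME_STATE = 5


@dataclass(frozen=True)
class Claim:
    statement: str
    layer: SourceLayer
    origin: str  # path, query, or tool name that produced the claim


def highest_authority(claims: list[Claim]) -> Claim:
    """Return the claim backed by the highest-ranked layer.

    Ties go to the first claim in the list; a real system would
    probably want to flag ties instead of silently picking one.
    """
    if not claims:
        raise ValueError("no claims to rank")
    return max(claims, key=lambda c: c.layer)


claims = [
    Claim("the release is deployed", SourceLayer.RAW_MATERIAL, "user message"),
    Claim("the release is not deployed", SourceLayer.RUNTIME_STATE, "build log"),
]
winner = highest_authority(claims)
print(winner.statement, "<-", winner.origin)
```

The point of the sketch is that the ranking lives in code or configuration, where it can be reviewed, rather than being left to whatever text happens to be most persuasive in context.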

Evidence-carrying actions

Any agent action with side effects should carry the evidence that justifies it.

Not every step needs a long evidence packet. But when an action changes code, data, publication state, external accounts, project rules, or long-lived knowledge, “I checked” is not enough.

A minimal evidence payload should answer:

  • Which source supports this action?
  • What path, timestamp, query, or tool output produced the source?
  • Which assumptions are known, and which are inferred?
  • Are there conflicting sources?
  • How can a reviewer reproduce or validate the claim?
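
The five questions above map naturally onto a small record type. This is a sketch under assumptions: the Evidence class and its field names are hypothetical, chosen only to mirror the questions, and which fields count as required is a project decision.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Evidence:
    source: str       # which source supports the action
    provenance: str   # path, timestamp, query, or tool output behind the source
    known_assumptions: list[str] = field(default_factory=list)
    inferred_assumptions: list[str] = field(default_factory=list)
    conflicting_sources: list[str] = field(default_factory=list)
    how_to_verify: str = ""  # steps a reviewer can follow to reproduce the claim

    def missing_fields(self) -> list[str]:
        """Names of required text fields that are empty or whitespace."""
        required = {
            "source": self.source,
            "provenance": self.provenance,
            "how_to_verify": self.how_to_verify,
        }
        return [name for name, value in required.items() if not value.strip()]


# Hypothetical example values, for illustration only.
payload = Evidence(
    source="unit test results",
    provenance="local test run, output captured in the task log",
    inferred_assumptions=["staging config matches production"],
)
print(payload.missing_fields())
```

Keeping known and inferred assumptions in separate lists makes it harder for an inference to be quietly promoted to a fact when the payload is reviewed.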

In code work, evidence may be tests, logs, screenshots, API output, or diffs.
In knowledge publication, evidence may be source pages, publication manifests, bilingual siblings, and route checks.
In finance or trading systems, evidence must separate prediction, strategy, execution, account state, and risk constraints.
In multimodal modeling, evidence may include the original floor plan, a structured design state, bridge traces, and visual review.

The evidence type changes by project. The principle does not: side effects without evidence should not count as completion.
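
The principle can be enforced as a simple gate at the point where a task is marked done. The sketch below is illustrative: ActionResult, completion_status, and the status strings are hypothetical names, and evidence is reduced to a list of references for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ActionResult:
    name: str
    side_effecting: bool
    evidence_refs: list[str] = field(default_factory=list)


def completion_status(result: ActionResult) -> str:
    """A side-effecting action without evidence is not reported as complete."""
    if result.side_effecting and not result.evidence_refs:
        return "needs_evidence"
    return "complete"


print(completion_status(ActionResult("summarize notes", side_effecting=False)))
print(completion_status(ActionResult("update config", side_effecting=True)))
```

Read-only steps pass through untouched, so the gate adds friction only where an action actually changes something.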

Do not auto-merge conflicts

Agents like to reconcile conflicts. When two sources disagree, the model often produces a smooth compromise.

That is dangerous.

If product documentation says A but the code shows B, the agent should not invent “the system supports both A and B.” If a user says something was deployed, but GitHub, Cloudflare, or a build trace cannot confirm it, the agent should not treat the statement as fact. If a headline suggests one conclusion but the original text, table, or date does not support it, the headline is not enough.

The better flow is:

mark conflict -> identify the higher-authority source -> ask or verify -> then act

If source authority cannot be resolved, the agent should stay in analysis mode instead of turning conflict into certainty.
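
The flow can be sketched as a function that never returns a blended value. Everything here is an assumption made for illustration: the Reading type, the integer authority scale, and the mode names "act", "verify", and "analysis" are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    value: str
    authority: int  # higher means more authoritative; the scale is project-specific
    origin: str


def resolve(readings: list[Reading]) -> dict:
    """Mark conflicts and pick a next step without merging values."""
    if not readings:
        raise ValueError("no readings to resolve")

    values = {r.value for r in readings}
    if len(values) == 1:
        # All sources agree, so there is nothing to reconcile.
        return {"conflict": False, "mode": "act", "value": readings[0].value}

    top = max(r.authority for r in readings)
    top_values = {r.value for r in readings if r.authority == top}
    if len(top_values) > 1:
        # The highest-authority sources disagree with each other:
        # stay in analysis mode rather than guess.
        return {"conflict": True, "mode": "analysis", "value": None}

    # One source clearly outranks the rest, but the conflict is still
    # recorded and the value must be confirmed before acting.
    return {"conflict": True, "mode": "verify", "value": top_values.pop()}


result = resolve([
    Reading("feature A", authority=3, origin="product documentation"),
    Reading("feature B", authority=5, origin="actual file content"),
])
print(result)
```

Note that no branch produces "A and B": the function returns one source's value or none at all, and a detected conflict always routes through verification or analysis before any action.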

What this means for knowledge publication

A central knowledge base needs source authority.

A concept page may include reasoning, but it must trace back to source pages, project evidence, or clearly marked inference. A public article can be more expressive, but it should not present unverified speculation as fact. The site is a publishing node, not the source of truth.

That is why a publication adapter should not simply copy text. It should preserve source identity, source hashes, translation keys, review status, and generated paths. The public article can read naturally, but the system behind it must be able to reconstruct the source trace.
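
A sketch of the record such an adapter might keep alongside each published page follows. The PublicationRecord type, its field names, and the helper functions are hypothetical; the choice of SHA-256 for the source hash is likewise an assumption, not something the workflow above prescribes.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PublicationRecord:
    source_id: str        # stable identity of the source page
    source_hash: str      # hex digest of the source text at publish time
    translation_key: str  # links bilingual siblings of the same article
    review_status: str
    generated_path: str   # where the public article was written


def _digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_record(source_id: str, source_text: str, translation_key: str,
                 review_status: str, generated_path: str) -> PublicationRecord:
    """Capture the source trace at the moment of publication."""
    return PublicationRecord(source_id, _digest(source_text), translation_key,
                             review_status, generated_path)


def source_changed(record: PublicationRecord, current_text: str) -> bool:
    """True if the source no longer matches what was published."""
    return _digest(current_text) != record.source_hash


# Hypothetical identifiers and paths, for illustration only.
record = build_record("concepts/source-authority", "original source text",
                      "source-authority", "reviewed", "/en/source-authority/")
print(source_changed(record, "original source text"))
print(source_changed(record, "edited source text"))
```

Storing the hash lets the system detect when a public article has drifted from its source, without the article itself having to carry any of that machinery.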

Core judgment

The stronger an agent’s language ability becomes, the lower the evidentiary weight of language itself should be.

In real workflows, fluent text should serve evidence, not replace it. An agent that can show why it claims something, where the evidence came from, and how conflicts were handled is much closer to being usable in high-trust work.