Evaluating Agent Workflows: Traces, Cost, Recovery
Connects trajectory evaluation, goal-level cost, semantic recovery, and confidence calibration into a practical evaluation frame for agent workflows.
6 public items tagged agent-governance.
Defines AI agent operational governance: the layer of authority, evidence, tools, traces, costs, and human gates that makes agents usable in real work.
Explains residual drift in long-running agent sessions and why serious workflows need commitment tracking, not just memory or contradiction checks.
Shows how evidence-carrying actions and source authority rules prevent fluent text from overriding structured data, tool output, or source evidence.
Explains why agent actions need runtime authority checks and autonomy gates, instead of relying only on plan-time approval or a confident task plan.
Treats MCP servers, plugins, tool descriptions, and dependencies as part of the agent control surface that needs governance and supply-chain review.