How to Verify an Agent Harness

An agent harness cannot be verified through a live demo alone.

A live demo proves that one path worked once. It does not prove that the product can be installed, run in a clean project, load runtime skills, find bridge files, generate traces, or handle host-application blockers.

A product like SAH needs layered verification.

Why headless smoke matters

SketchUp is a desktop application. If every validation step requires opening a live SketchUp session, then development, CI, and regression checks all become slow.

Most product logic should therefore be verifiable headlessly:

project initialization;
schema and manifest validation;
project state inspection;
design-rule merging;
component metadata loading;
bridge trace planning;
import manifest and evidence structure;
packaged runtime asset presence.

Headless smoke does not replace the live bridge. It moves the parts that do not require the host application earlier in the verification chain.
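As a minimal sketch of what a headless check can look like, the following Python function inspects a project directory without starting the host application. The file names (manifest.json, skills, bridge) and the required manifest keys are assumptions made for illustration, not SAH's actual layout.

```python
import json
from pathlib import Path

# Hypothetical layout for illustration only; not SAH's actual file names.
REQUIRED_ASSETS = ("manifest.json", "skills", "bridge")
REQUIRED_MANIFEST_KEYS = ("name", "version")


def headless_smoke(project_dir: Path) -> list[str]:
    """Collect problems that can be detected without starting SketchUp."""
    problems: list[str] = []

    # Asset presence: pure filesystem checks, no host application involved.
    for rel in REQUIRED_ASSETS:
        if not (project_dir / rel).exists():
            problems.append(f"missing asset: {rel}")

    # Manifest validation: parse, then check the keys this sketch assumes.
    manifest_path = project_dir / "manifest.json"
    if manifest_path.is_file():
        try:
            manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"manifest is not valid JSON: {exc.msg}")
        else:
            if not isinstance(manifest, dict):
                problems.append("manifest is not a JSON object")
            else:
                for key in REQUIRED_MANIFEST_KEYS:
                    if key not in manifest:
                        problems.append(f"manifest missing key: {key}")
    return problems
```

An empty list means the headless layer found nothing wrong. It says nothing about whether the live bridge works.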

Live bridge checks solve a different class of problems

Some problems only appear with the host application running.

Examples include:

whether the SketchUp bridge is actually loaded;
whether the local bridge transport is available;
whether SketchUp is blocked by a welcome screen, account prompt, update prompt, or license issue;
whether the Ruby bridge can execute geometry operations;
whether operation responses include entity ids and spatial deltas;
whether rollback or errors return structured information.

These checks are expensive, so they should not carry all verification. But they must exist; otherwise the product may be correct only in a headless world.
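One part of a live check that can be expressed compactly is the shape of what the bridge returns. The sketch below validates a single operation response. The field names (ok, entity_ids, spatial_delta, error, blocker) and the blocker labels are hypothetical and chosen only to mirror the concerns listed above; the real response format should be taken from the product's own documentation.

```python
from typing import Any

# Hypothetical blocker labels, mirroring the host-side blockers named above.
KNOWN_BLOCKERS = {"welcome_screen", "account_prompt", "update_prompt", "license_issue"}


def check_operation_response(response: dict[str, Any]) -> list[str]:
    """Return problems with the structure of one bridge operation response."""
    problems: list[str] = []

    if response.get("ok") is True:
        # A successful geometry operation should say what it touched and how.
        ids = response.get("entity_ids")
        if not isinstance(ids, list) or not ids:
            problems.append("successful response has no entity ids")
        if not isinstance(response.get("spatial_delta"), dict):
            problems.append("successful response has no spatial delta")
    else:
        # A failure should be structured, not a bare string or a missing field.
        error = response.get("error")
        if not isinstance(error, dict) or "code" not in error or "message" not in error:
            problems.append("failure is not structured (expected error.code and error.message)")
        blocker = response.get("blocker")
        if blocker is not None and blocker not in KNOWN_BLOCKERS:
            problems.append(f"unknown blocker: {blocker}")
    return problems
```

A validator like this can be unit-tested headlessly, but it only becomes meaningful once it is fed responses from a running host.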

Source checkout success is not user success

Many local agent tools work in the source checkout and fail after packaging.

The cause is often not core logic but boundary assets:

runtime skills missing from the package;
bridge files not included;
plugin metadata pointing at source paths;
templates or static assets missing;
CLI entry points unavailable after install;
startup checks depending on maintainer directory structure.
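Several of these failures come down to a path that only resolves on the maintainer's machine. A minimal sketch of a check for that, assuming plugin metadata can be reduced to a mapping from a label to a path string (the mapping and its keys are hypothetical):

```python
from pathlib import Path


def check_metadata_paths(paths: dict[str, str], package_root: Path) -> list[str]:
    """Flag metadata paths that would not resolve inside an installed package."""
    root = package_root.resolve()
    problems: list[str] = []
    for name, raw in paths.items():
        candidate = Path(raw)
        if candidate.is_absolute():
            # Absolute paths usually encode the maintainer's directory layout.
            problems.append(f"{name}: absolute path {raw!r} will not survive install")
            continue
        resolved = (root / candidate).resolve()
        if not resolved.is_relative_to(root):  # Path.is_relative_to needs Python 3.9+
            problems.append(f"{name}: {raw!r} escapes the package root")
        elif not resolved.exists():
            problems.append(f"{name}: {raw!r} is missing from the package")
    return problems
```

Run against the source checkout, a check like this may pass while the same metadata fails against the installed package, which is the reason to run it in both places.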

If the product promises designers that they do not need to clone the source repository, then the installed package path is closer to the real user path than the source checkout is.

Release smoke should cover the installed shape

Release smoke does not need to prove every feature is perfect. It needs to prove that the published artifact has not broken the key paths users will touch.

For a product like SAH, that means checking package build, CLI entry points, packaged runtime skills, packaged Ruby bridge, plugin metadata, project initialization, headless validation, bridge install dry run, release check, and installed package check.

That is closer to user experience than unit tests alone.
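To illustrate checking the artifact rather than the source tree, the sketch below inspects a built archive for required entries. It assumes, purely for illustration, that the published artifact is a zip-based archive (a Python wheel is one example of that format); the entry names passed in are hypothetical, and SAH's actual packaging may differ.

```python
import zipfile
from pathlib import Path


def missing_from_archive(archive: Path, required: list[str]) -> list[str]:
    """Return the required entries that the built archive does not contain.

    An entry ending in "/" is treated as a directory prefix; any other entry
    must match a file name exactly or be the parent directory of some file.
    """
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
    return [
        entry
        for entry in required
        if not any(
            name == entry or name.startswith(entry.rstrip("/") + "/")
            for name in names
        )
    ]
```

This only covers asset presence. CLI entry points, bridge install dry run, and the installed package check still need to be exercised against a real install.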

Product facts still belong to the repo

This article is not a command tutorial, so it does not list full command sequences. Exact commands, flags, and environment assumptions should come from the current product README, DEVELOPMENT guide, and capability map.

The reason is simple: verification commands change quickly. Once an article becomes a tutorial, every command in it has to be re-run against the current product state. Otherwise the published article turns into stale installation guidance.

For a developer architecture article, the more durable point is the verification layering:

unit / contract
-> headless smoke
-> bridge install dry run
-> live bridge integration
-> release smoke
-> installed package smoke
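The layering above can be sketched as an ordered list of stages that runs cheap checks first and stops at the first failure. The Stage type, the needs_host flag, and the decision to skip rather than fail host-dependent stages when no host is available are all assumptions of this sketch, not a description of how SAH runs its checks.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Stage:
    name: str
    needs_host: bool          # True if the stage requires a running host application
    run: Callable[[], bool]   # returns True when the stage passes


def run_layers(stages: list[Stage], host_available: bool) -> list[tuple[str, str]]:
    """Run stages in order; skip host-only stages without a host; stop on failure."""
    results: list[tuple[str, str]] = []
    for stage in stages:
        if stage.needs_host and not host_available:
            results.append((stage.name, "skipped"))
            continue
        passed = stage.run()
        results.append((stage.name, "passed" if passed else "failed"))
        if not passed:
            break  # later, more expensive layers are not worth running yet
    return results
```

Recording a skipped stage explicitly, instead of omitting it, keeps a headless CI run from looking like full coverage.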

This pattern generalizes

Any local agent harness that ships runtime skills, plugins, a host bridge, templates, and CLI entry points will face similar issues.

Working from source is the maintainer path. Working after install is the user path.

If the product promise is “start the agent in your own project directory,” verification has to look from the installed package and user-project perspective, not only from the source repository.

Source trace

sketchup-agent-harness:README.md
sketchup-agent-harness:DEVELOPMENT.md
sketchup-agent-harness:docs/product/capability-map.md
Installed Package Release Smoke
Project Workspace As Agent Operating Surface